PREFACE

The objective of this book is the discussion and the practical illustration of
techniques used in applied macroeconometrics. The plural here is as appropriate
as ever, in that the profession does not currently share a common view on
the methodology of applied macroeconometrics. The different approaches are
regarded as "alternative": it is very rare to see a combination of them
applied by the same authors, and it is also very difficult to see a combination
of them published in the same journal, with the notable exception of the Journal
of Applied Econometrics. Interestingly, up to the seventies there was consensus,
both regarding the theoretical foundation and the empirical specification
of macroeconometric modelling. The consensus was represented by the Cowles
Commission approach, which broke down in the seventies when it was discovered
that this type of model "...did not represent the data, ... did not represent
the theory... were ineffective for practical purposes of forecasting and policy..."
(Pesaran and Smith (1995)). The Cowles Commission approach was then replaced
by a number of prominent methods of empirical research: the LSE (London
School of Economics) approach, the VAR approach, and the intertemporal
optimization/Real Business Cycle approach. We shall discuss and illustrate the
empirical research strategy of these three alternative approaches by interpreting
them as different proposals to solve problems observed in the Cowles Commission
approach. The presentation of each methodological approach is paired with
extensive discussions and replications of the relevant empirical work. The bulk
of the empirical illustrations is related to the monetary transmission mechanism,
considering benchmark data-sets for the US and European economies. This choice
allows us to have a common benchmark to address explicitly the differences in
questions and answers provided by the different schools of thought.

Plan of the book 

Our presentation is based on the conviction that the oldest concept in
econometrics, namely identification, provides a natural unified framework to discuss
the collapse of Cowles Commission models and the alternative strategies
currently adopted in applied research. Therefore, in the first part of the book we
introduce time series models and discuss extensively the importance of
identification for time-series models. In the second part of the book we illustrate the
Cowles Commission approach and the LSE, VAR, and intertemporal
optimisation/calibration approaches, providing applications.

Chapter 1 serves the purposes of giving a quick revision of basic econometrics
and of introducing macroeconometrics. This is done by analysing the economic
problem of convergence and growth, using the data-set and replicating the
results reported in Mankiw, Romer and Weil (MRW, 1992). Traditional issues in
econometrics, with particular emphasis on the mis-specification problem, are
revised using the cross-section data set of the cited article. The importance of
time series in econometrics is then stressed by illustrating the potential of the
application of time series methodology to the MRW data-set.
Chapters 2 and 3 form the econometric basis for the illustration of all the
different approaches to time-series. The discussion starts with the problem of
temporal dependence of time series and its impact on the properties of
estimators; the solution provided by asymptotic theory for stationary time-series is
then discussed. Non-stationarity is introduced as the reason why asymptotic
theory cannot be used to fix dependence. Cointegration is then considered
as the solution, in that it allows us to specify a cointegrated VAR as the baseline
stationary statistical model for non-stationary time-series. On such a statistical
model the fundamental problem of identification is discussed, with a separate
treatment of identification of long-run equilibrium relationships and short-run
simultaneous feedback. Within this common statistical framework, we introduce
the criticisms of Cowles Commission econometrics and our analysis of the
alternative econometric methodologies currently used by the profession. Chapter 4
illustrates the Cowles Commission approach by considering specification,
estimation and simulation of a simple IS-LM-AS-AD model fitted to US data. Chapter
5 illustrates the LSE methodology by introducing the diagnosis of the Cowles
Commission problems provided within this approach as well as the proposed
solution. Chapters 6, 7, and 8 apply the same strategy to the VAR, intertemporal
optimisation GMM and calibration approaches. The chapter on calibration would
not exist in its current format without the contribution of Marco Maffezzoli, who
is jointly responsible for all of this section. Marco and I have jointly taught the
advanced econometrics option in Bocconi's Master in Economics; the book
has greatly benefited from this experience, and not only the book.

Data, programmes and exercises 

Data, programmes and exercises are available from the section of my homepage
devoted to the book at the following address:

http://www.igier.uni-bocconi.it/personal/favero/homepage.htm

Applications presented in the book are performed by using different packages, 
such as E-VIEWs, PC-GIVE, PC-FIML, RATS and MATLAB. The data are 
provided in the appropriate format for the package used in the application and 
also in EXCEL format, to leave the reader free to experiment with the preferred 
software. 

Exercises are not yet available at the date of publication of the book, but I 
plan to include them in the site as part of a project of continuous updating of the 
book. I really hope that the website will grow over time, also with contributions 
by the readers. 

Acknowledgements

many co-authors. My grateful thanks go to Fabio-Cesare Bagliano, Rudi Dorn-
saran, Marco Pifferi, Riccardo Rovelli, Giorgio Primiceri, Sunil Sharma, Luigi
Spaventa, Franco Spinelli, Guido Tabellini.

APPLIED MACROECONOMETRICS. AN INTRODUCTION

1.1 Introduction 

Once upon a time there was consensus both on the theoretical foundations of 
macroeconomics and on the correct approach to macroeconometric modelling 
(see, for example, Pesaran-Smith [7]). Such consensus, which was built around 
the “Cowles Commission” approach to model building, broke down dramatically 
at the beginning of the seventies when it was discovered that “...the models 
did not represent the data...did not represent the theory... were ineffective for 
practical purposes of forecasting and policy...”. The breakdown of consensus has 
been rather spectacular, but, as Faust and Whiteman ([2]) put it “... even more 
impressive are the deep rifts that have emerged over the proper way to tease 
empirical facts from macroeconomic data...” 

This book has the ambitious aim of discussing and illustrating the different
approaches currently taken by the profession in doing applied macroeconometrics.
We concentrate on the (large) subset of macroeconometrics dealing with
time-series data. It is fair to say that the emergence of the deep rifts on the
proper way to tease empirical facts from macroeconomic data has been paired
with a deep awareness of the specificity of time-series data. We shall discuss
the emergence of a plurality of approaches in macroeconomic modelling within
the framework provided by the statistical analysis of time series data. We begin
our work with this introductory chapter, which reviews the basics of econometrics,
describes the interaction between theory and data in applied work, and
illustrates the importance of using time series instead of cross-section data in
macroeconometrics.

1.2 From theory to data: the new-classical growth model. 

Consider the Solow model of growth [1]. This model takes as given the saving rate
s and the rate of growth of population n, while technology, A, grows at a constant
rate g. There are two inputs, capital, K, and labour, L, each paid its
marginal productivity. Output, Y, is determined by a Cobb-Douglas function
with constant returns to scale:

$$Y_t = K_t^{\alpha}\,(A_t L_t)^{1-\alpha}, \qquad 0 < \alpha < 1 \tag{1.1}$$

$$L_t = L_{t-1}\,(1+n) \tag{1.2}$$

$$A_t = A_{t-1}\,(1+g) \tag{1.3}$$

[1] The original reference is Solow ([9]). The data and the empirical analysis of this chapter
replicate the results reported in Mankiw, Romer and Weil ([6]). For an excellent introduction
to macroeconomic models of growth see Farmer ([1]).


Note that the number of effective units of labour grows (approximately) at
the rate (n + g). The model is built by considering the production function
together with two accounting identities and an ad hoc relation between savings
and output. The two accounting identities are:

$$S_t = I_t \tag{1.4}$$

$$K_t = K_{t-1}\,(1-\delta) + I_t \tag{1.5}$$

where $I_t$ denotes investment, $S_t$ denotes savings and $\delta$ represents the rate of
depreciation of the capital stock K. (1.4) makes immediately clear that we are
considering a closed economy with no government sector.

The relationship between output and saving is determined by assuming a
constant marginal propensity to save s:

$$\frac{S_t}{Y_t} = s \tag{1.6}$$

We define as k and y respectively the stock of capital per effective unit of 
labour ( K/AL ) and the level of output per effective unit of labour ( Y/AL ) . By 
using all the equations in the model we have: 


$$k_t\,(1+n)(1+g) = k_{t-1}\,(1-\delta) + s\,k_{t-1}^{\alpha} \tag{1.7}$$

Equation (1.7) determines the pattern over time of the stock of capital per
effective unit of labour. From this relation we can pin down the steady state
value of k by setting $k^* = k_{t+i}$ for each i:

$$k^* = \left(\frac{s}{n+g+\delta}\right)^{\frac{1}{1-\alpha}} \tag{1.8}$$


(1.8) makes clear that the steady state k is positively related to the saving
rate and negatively related to the rate of growth of population, the rate of
technological progress and the rate of depreciation of capital.
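The convergence to the steady state can be checked numerically. The following is a minimal sketch, not part of the original analysis: it iterates the law of motion (1.7), read here with the saving term dated t-1, from an arbitrary starting value and compares the limit with the closed form (1.8). The parameter values and the function name `solow_path` are illustrative assumptions. Because (1.8) drops the small cross-product term ng, the sketch also computes the exact fixed point of (1.7) for comparison.

```python
def solow_path(k0, s, n, g, delta, alpha, periods):
    """Iterate k_t (1+n)(1+g) = k_{t-1}(1-delta) + s * k_{t-1}**alpha."""
    path = [k0]
    for _ in range(periods):
        k_prev = path[-1]
        k_next = (k_prev * (1.0 - delta) + s * k_prev ** alpha) / ((1.0 + n) * (1.0 + g))
        path.append(k_next)
    return path


# Illustrative (assumed) parameter values, not taken from the data
s, n, g, delta, alpha = 0.2, 0.01, 0.02, 0.03, 1.0 / 3.0

path = solow_path(k0=1.0, s=s, n=n, g=g, delta=delta, alpha=alpha, periods=2000)
k_limit = path[-1]

# Closed form (1.8), which neglects the cross-product n*g
k_star_approx = (s / (n + g + delta)) ** (1.0 / (1.0 - alpha))
# Exact fixed point of the difference equation, keeping n*g
k_star_exact = (s / (n + g + n * g + delta)) ** (1.0 / (1.0 - alpha))

print(k_limit, k_star_exact, k_star_approx)
```

With growth rates of a few percentage points the two steady-state values differ by well under one per cent, which is why the approximation in (1.8) is harmless in practice.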
By substituting (1.8) in the production function and taking logarithms we
can derive the per capita steady state output as:

$$\ln\left(\frac{Y_t^*}{L_t}\right) = \ln A_0 + gt + \frac{\alpha}{1-\alpha}\ln(s) - \frac{\alpha}{1-\alpha}\ln(n+g+\delta) \tag{1.9}$$

(1.9) makes very specific predictions on the impact on output of the saving
rate, the rate of growth of population, the rate of technological progress
and the rate of depreciation of capital.

It is natural at this stage to raise a question on the empirical support for
such well specified predictions. Mankiw, Romer and Weil ([6]) choose to test
the model on data from a cross-section of countries. Such data are available
in a database constructed by Summers-Heston (1988), which contains series on
real output, private and government consumption, investment and population
for virtually all countries in the world, excluding planned economies. The data
are available at annual frequency. Mankiw, Romer and Weil concentrate on the
variables of interest for the period 1960-1985. The rate of growth of population,
n, is measured by the average rate of growth of the population of working age
(15-64 years old). The rate of savings, s, is measured by the ratio of investment to
GNP. n and s are averages for the period 1960-1985. y is measured by the log of
GDP per working age person in 1985. (g + δ) is not directly observable and it is
assumed constant at a value of 0.05. We concentrate on a sample of 75 countries
labelled Intermediate by Mankiw, Romer and Weil, obtained by considering
non-oil-producing countries with a population higher than one million in 1960 and
reliable data, thus excluding from the sample oil producers (as the bulk of GDP
for such countries is not value added but extraction of existing resources), small
countries and countries with low-quality data (receiving a grade of "D" from
Summers and Heston). The data are contained, in EXCEL format, in the file
MRW.XLS.

Now we have data and we have (1.9), which makes specific, theory-based
predictions on the relations between variables in our data set. The natural question
is: how do we test the Solow model empirically?

The first point to note is that there is no stochastic structure in (1.9).
Mankiw, Romer and Weil add a stochastic structure to the data by ignoring
the difference between Y and Y* and by concentrating on the term A. In fact A
reflects not only the state of technology but also other factors, such as natural
resources, climate and institutions; therefore the following specification is adopted
for A:

$$\ln A_0 = a + \epsilon_i$$

where a is a constant and $\epsilon_i$ represents a country-specific shock. (1.9)
becomes now:

$$\ln y_i = \ln A_0 + gt + \frac{\alpha}{1-\alpha}\ln(s_i) - \frac{\alpha}{1-\alpha}\ln(n_i+g+\delta) + \epsilon_i \tag{1.10}$$

which forms the basis for the empirical investigation.

1.3 The estimation problem: Ordinary Least Squares 

The basis for the empirical test of the predictions of Solow's growth model is
the estimation of (1.10). Consider the estimation of the following model on our
sample of 75 countries:

$$\ln y_i = \beta_0 + \beta_1 \ln(s_i) + \beta_2 \ln(n_i+g+\delta) + \epsilon_i. \tag{1.11}$$

If the Solow model describes the data correctly, then the parameter $\beta_0$ should
capture the term $\ln A_0 + gt$, which is constant across the cross-section of data,
while $\beta_1$ should be equal to $\frac{\alpha}{1-\alpha}$ and $\beta_2$ should instead take the value $-\frac{\alpha}{1-\alpha}$.
Therefore, independent information on factor shares could be used to assess the
magnitude of the estimated coefficients: Mankiw, Romer and Weil claim that data
on factor shares suggest one-third as a plausible value for $\alpha$ and therefore the
elasticities of $y_i$ with respect to $s_i$ and $(n_i+g+\delta)$ should be respectively 0.5
and -0.5. Moreover, under the null of the validity of the Solow model, we have a
testable restriction on the parameters, namely $\beta_1 = -\beta_2$.

To illustrate how estimation can be performed, consider the following general
representation of our model:

$$y = X\beta + \epsilon$$

$$y = \begin{pmatrix} y_1 \\ \vdots \\ y_N \end{pmatrix}, \quad
X = \begin{pmatrix} x_{11} & x_{12} & \cdots & x_{1k} \\ \vdots & \vdots & & \vdots \\ x_{N1} & x_{N2} & \cdots & x_{Nk} \end{pmatrix}, \quad
\beta = \begin{pmatrix} \beta_1 \\ \vdots \\ \beta_k \end{pmatrix}, \quad
\epsilon = \begin{pmatrix} \epsilon_1 \\ \vdots \\ \epsilon_N \end{pmatrix}$$

In our case N = 75, k = 3, the vector y contains 75 observations on per
capita GDP while the matrix X is (75 × 3). Note that the first column of X is made
entirely of ones, the second column contains observations on
$\ln(s_i)$, while the third one contains observations on $\ln(n_i+g+\delta)$. The vector
$\beta$ contains three parameters: a constant and the two elasticities of interest
in our economic problem.

The simplest method to derive estimates of the parameters of interest is
the Ordinary Least Squares (OLS) method. Such a method chooses values for
the unknown parameters to minimize, in some sense, the magnitude of the
non-observable components. Define the following quantity:

$$e(\beta) = y - X\beta$$

where $e(\beta)$ is an (n × 1) vector. If we treat $X\beta$ as a (conditional) prediction
for y, then we can consider $e(\beta)$ as a forecasting error. The sum of the squared
errors is then

$$S(\beta) = e(\beta)'e(\beta)$$

OLS produces an estimator of $\beta$, $\hat{\beta}$, defined as follows:

$$S(\hat{\beta}) = \min_{\beta}\; e(\beta)'e(\beta)$$

Given $\hat{\beta}$, we can define an associated vector of residuals $\hat{\epsilon}$ as
$\hat{\epsilon} = y - X\hat{\beta}$. The OLS estimator can be derived by considering the necessary
and sufficient conditions for $\hat{\beta}$ to be a unique minimum for S:

i) $X'\hat{\epsilon} = 0$

ii) rank(X) = k

Condition i) imposes orthogonality between the right-hand side variables and
the OLS residuals, and ensures that the residuals have an average of zero when a
constant is included among the right-hand side variables (the regressors).
Condition ii) requires that the columns of the X matrix are linearly independent: no
variable in X can be expressed as a linear combination of the other variables in
X.

From i) we can derive an expression for the OLS estimates:

$$X'\hat{\epsilon} = X'(y - X\hat{\beta}) = X'y - X'X\hat{\beta} = 0$$
$$\hat{\beta} = (X'X)^{-1}X'y$$
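The derivation translates directly into a few lines of matrix code. The sketch below is only illustrative: it uses simulated data rather than the MRW data-set, and the "true" coefficients, the seed and the variable names are assumptions made for the example. It solves the normal equations $X'X\hat{\beta} = X'y$ and then checks conditions i) and ii).

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated cross-section with the same shape as the Solow regression (N = 75, k = 3)
N, k = 75, 3
X = np.column_stack([np.ones(N), rng.normal(size=N), rng.normal(size=N)])
beta_true = np.array([5.0, 0.5, -0.5])          # assumed values, for illustration only
y = X @ beta_true + rng.normal(scale=0.6, size=N)

# Condition ii): the columns of X must be linearly independent
assert np.linalg.matrix_rank(X) == k

# Solve the normal equations X'X b = X'y (numerically preferable to inverting X'X)
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
residuals = y - X @ beta_hat

# Condition i): the residuals are orthogonal to every regressor, including the constant
print(beta_hat)
print(X.T @ residuals)
```

Because the first column of X is a constant, orthogonality also implies that the residuals sum to zero, as noted in the discussion of condition i).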
1.3.1 Properties of the OLS estimates 

We have derived the OLS estimator without any assumption on the statistical
structure of the data. In fact the statistical structure of the data is not needed
to derive the estimator but to define its properties. To illustrate such properties
we refer to the basic concepts of mean and variance of vector variables.
Given a generic vector of variables, x

$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}$$

we define the mean vector E(x) and the mean matrix of outer products
E(xx') as follows:

$$E(x) = \begin{pmatrix} E(x_1) \\ \vdots \\ E(x_n) \end{pmatrix}$$

$$E(xx') = E\begin{pmatrix} x_1^2 & x_1x_2 & \cdots & x_1x_n \\ x_2x_1 & x_2^2 & \cdots & x_2x_n \\ \vdots & \vdots & & \vdots \\ x_nx_1 & x_nx_2 & \cdots & x_n^2 \end{pmatrix}
= \begin{pmatrix} E(x_1^2) & E(x_1x_2) & \cdots & E(x_1x_n) \\ E(x_2x_1) & E(x_2^2) & \cdots & E(x_2x_n) \\ \vdots & \vdots & & \vdots \\ E(x_nx_1) & E(x_nx_2) & \cdots & E(x_n^2) \end{pmatrix}$$


The variance-covariance matrix of x is then defined as follows:

$$var(x) = E\left[(x - E(x))(x - E(x))'\right] = E(xx') - E(x)E(x)'$$

Note that the variance-covariance matrix is symmetric and positive semi-definite
by construction. In fact, given an arbitrary vector $\lambda$ of dimension n, we have:

$$var(\lambda'x) = \lambda'\,var(x)\,\lambda \geq 0$$

The first relevant hypothesis for the derivation of the statistical properties
of OLS regards the relationship between disturbances and regressors in the
estimated equation. This hypothesis is constructed in two parts. First, it is assumed that
$E(y_i \mid x_i) = x_i'\beta$; this rules out contemporaneous correlation between disturbances
and regressors (it is therefore valid if there are no omitted variables correlated
with the regressors). Second, it is assumed that the components of the available
sample are independently drawn. The second part of this assumption guarantees
the equivalence between $E(y_i \mid x_i) = x_i'\beta$ and $E(y_i \mid x_1, \ldots, x_i, \ldots, x_n) = x_i'\beta$.
Using vector notation we have

$$E(y \mid X) = X\beta$$

which can be written equivalently as

$$E(\epsilon \mid X) = 0 \tag{1.12}$$

Note that hypothesis (1.12) is very demanding. In fact, it implies that

$$E(\epsilon_i \mid x_1, \ldots, x_i, \ldots, x_n) = 0 \qquad (i = 1, \ldots, n)$$

The conditional mean is in general a non-linear function of $(x_1, \ldots, x_i, \ldots, x_n)$;
(1.12) requires that such a function is a constant equal to zero. Note that (1.12) requires
that each regressor is orthogonal not only to the error term associated with the
same observation ($E(x_{ik}\epsilon_i) = 0$ for all k) but also to the error term associated with
every other observation ($E(x_{jk}\epsilon_i) = 0$ for all $j \neq i$). This statement is proved
by using the properties of conditional expectations.

Given that $E(\epsilon \mid X) = 0$ implies, by the Law of Iterated Expectations, that
$E(\epsilon_i) = 0$, we have

$$E(\epsilon_i \mid x_{jk}) = E\left[E(\epsilon_i \mid X) \mid x_{jk}\right] = 0 \tag{1.13}$$

Then

$$E(\epsilon_i x_{jk}) = E\left[E(\epsilon_i x_{jk} \mid x_{jk})\right] \tag{1.14}$$

$$= E\left[x_{jk}\,E(\epsilon_i \mid x_{jk})\right] \tag{1.15}$$

$$= 0 \tag{1.16}$$


In the context of the Solow model, (1.12) requires that s and n are independent
of $\epsilon$. Of course, such a hypothesis is not going to hold in any time-series model
when the time series show some degree of persistence (in practice, always). Think
of the simplest time-series model for a generic variable y:

$$y_t = a_0 + a_1 y_{t-1} + u_t.$$

It is clear that if $a_1 \neq 0$, then, although it is true that $E(u_t \mid y_{t-1}) = 0$,
$E(u_{t-1} \mid y_{t-1}) \neq 0$ and (1.12) is violated, without any omitted variable problem.
This explains why we have used a cross-section example in our introductory
chapter; we shall then complicate the framework to deal properly with time-series
observations.
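A small simulation makes the point concrete. The sketch below is an illustration under assumed parameter values ($a_0 = 0.5$, $a_1 = 0.8$, unit-variance Gaussian shocks), none of which come from the text. It generates the first-order autoregression above and computes the sample correlation of the regressor $y_{t-1}$ with the current shock $u_t$ and with the previous shock $u_{t-1}$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed values for the illustration
T, a0, a1 = 20000, 0.5, 0.8
u = rng.normal(size=T)

y = np.empty(T)
y[0] = a0 / (1.0 - a1) + u[0]          # start near the unconditional mean
for t in range(1, T):
    y[t] = a0 + a1 * y[t - 1] + u[t]

regressor = y[:-1]      # y_{t-1}, the right-hand side variable at time t
shock_now = u[1:]       # u_t, the error of the same observation
shock_lag = u[:-1]      # u_{t-1}, the error of the previous observation

corr_now = np.corrcoef(regressor, shock_now)[0, 1]
corr_lag = np.corrcoef(regressor, shock_lag)[0, 1]
print(corr_now, corr_lag)
```

The regressor is (approximately) uncorrelated with the contemporaneous shock but clearly correlated with the shock of the previous observation, so the strict requirement in (1.12) fails even though nothing has been omitted from the model.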

The second hypothesis defines constancy of the conditional variance of shocks:

$$E(\epsilon\epsilon' \mid X) = \sigma^2 I \tag{1.17}$$

where $\sigma^2$ is a constant independent of X.

The third hypothesis is the one, already introduced, which guarantees that 
the OLS estimator can be derived: 


rank (X) = k (1.18) 

Under hypotheses (1.12) — (1.18) we can derive the properties of the OLS 
estimator. 

Property 1: unbiasedness 

The conditional expectation (with respect to X) of the OLS estimates is the
vector of unknown parameters $\beta$:

$$\hat{\beta} = (X'X)^{-1}X'(X\beta + \epsilon) = \beta + (X'X)^{-1}X'\epsilon$$

$$E(\hat{\beta} \mid X) = \beta + (X'X)^{-1}X'E(\epsilon \mid X) = \beta$$

by hypothesis (1.12).

Property 2: variance of OLS

The conditional variance of the OLS estimator is $\sigma^2(X'X)^{-1}$:

$$var(\hat{\beta} \mid X) = E\left((\hat{\beta}-\beta)(\hat{\beta}-\beta)' \mid X\right)$$
$$= E\left((X'X)^{-1}X'\epsilon\epsilon'X(X'X)^{-1} \mid X\right)$$
$$= (X'X)^{-1}X'E(\epsilon\epsilon' \mid X)X(X'X)^{-1}$$
$$= (X'X)^{-1}X'\sigma^2 I X(X'X)^{-1}$$
$$= \sigma^2(X'X)^{-1}$$
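Properties 1 and 2 are statements about repeated sampling with X held fixed, which is easy to mimic by simulation. The following sketch is an assumption-laden illustration: the design matrix, the parameter vector, $\sigma$ and the number of replications are all made up. It draws many error vectors for one fixed X and compares the Monte Carlo mean and covariance of $\hat{\beta}$ with $\beta$ and $\sigma^2(X'X)^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(2)

# One fixed design matrix (conditioning on X) and assumed true parameters
n, sigma = 50, 0.7
X = np.column_stack([np.ones(n), rng.normal(size=n), rng.uniform(size=n)])
beta = np.array([1.0, 0.5, -0.5])

XtX_inv = np.linalg.inv(X.T @ X)
theory_var = sigma ** 2 * XtX_inv                 # sigma^2 (X'X)^{-1}

# Draw many samples of y for the same X and re-estimate each time
reps = 20000
eps = rng.normal(scale=sigma, size=(reps, n))
Y = X @ beta + eps                                # each row is one simulated sample
draws = Y @ X @ XtX_inv                           # row r holds beta_hat' for sample r

mc_mean = draws.mean(axis=0)                      # should be close to beta
mc_var = np.cov(draws, rowvar=False)              # should be close to theory_var
print(mc_mean)
print(mc_var)
print(theory_var)
```

The agreement is only approximate, as with any Monte Carlo experiment, and it relies on the simulated errors satisfying (1.12) and (1.17) by construction.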

Property 3: Gauss-Markov theorem 

The OLS estimator is the most efficient in the class of linear unbiased
estimators.

Consider the class of linear estimators:

$$\hat{\beta}_L = Ly$$

This class is defined by the set of (k × n) matrices L, which are fixed when
conditioning upon X. L does not depend on y. Therefore we have:

$$E(\hat{\beta}_L \mid X) = E(LX\beta + L\epsilon \mid X) = LX\beta$$

and $LX\beta = \beta$ only if $LX = I_k$. Such a condition is obviously satisfied by the
OLS estimator, which is obtained by setting $L = (X'X)^{-1}X'$. The variance of
the general estimator in the class of linear unbiased estimators is readily obtained
as:

$$var(\hat{\beta}_L \mid X) = E(L\epsilon\epsilon'L' \mid X) = \sigma^2 LL'.$$

To show that the OLS estimator is the most efficient within this class we
have to show that the variance of the OLS estimator differs from the variance of
the generic estimator in the class by a positive semi-definite matrix.

To this aim define $D = L - (X'X)^{-1}X'$; $LX = I$ requires $DX = 0$.

$$LL' = \left((X'X)^{-1}X' + D\right)\left(X(X'X)^{-1} + D'\right)$$
$$= (X'X)^{-1}X'X(X'X)^{-1} + (X'X)^{-1}X'D' + DX(X'X)^{-1} + DD'$$
$$= (X'X)^{-1} + DD'$$

from which we have that

$$var(\hat{\beta}_L \mid X) = var(\hat{\beta} \mid X) + \sigma^2 DD'$$

which proves the point; in fact for any given matrix D, not necessarily square,
the symmetric matrix DD' is positive semidefinite.
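The algebra of the proof can be verified numerically for a particular alternative estimator. In the sketch below, which is purely illustrative, the matrix D is built as an arbitrary random matrix post-multiplied by $I - X(X'X)^{-1}X'$, one convenient (assumed) way of guaranteeing $DX = 0$. The code then checks that $L = (X'X)^{-1}X' + D$ is unbiased and that $LL' - (X'X)^{-1}$ equals $DD'$ and has no negative eigenvalues.

```python
import numpy as np

rng = np.random.default_rng(3)

n, k = 30, 3
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])

L_ols = np.linalg.inv(X.T @ X) @ X.T          # the OLS choice of L
M = np.eye(n) - X @ L_ols                     # annihilates X: M X = 0

# An arbitrary D with D X = 0, obtained by projecting a random matrix off X
D = rng.normal(size=(k, n)) @ M
L_alt = L_ols + D                             # another linear unbiased estimator

# Difference between the two variances, up to the common factor sigma^2
excess = L_alt @ L_alt.T - L_ols @ L_ols.T
eigenvalues = np.linalg.eigvalsh(excess)
print(eigenvalues)
```

Any other D satisfying $DX = 0$ would serve equally well; the random construction is used only to avoid hand-picking a favourable case.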

1.4 OLS estimation of the Solow growth model 

The results of the application of OLS to the Solow growth model are reported in
Table 1. We report point estimates along with standard errors (square roots of
the elements in the principal diagonal of the variance-covariance matrix of the
OLS estimates). The Table is based on a regression run by using E-Views and
exactly replicates the results in Table 1 of Mankiw, Romer and Weil (1992, p.
414).

TABLE 1: The estimation of the Solow model

Variable          Coefficient    Std. Error    t-ratio      Prob.
C                   5.367698      1.540082      3.485333    0.0008
ln(s)               1.325353      0.170611      7.768281    0.0000
ln(n + g + δ)      -2.013390      0.532830     -3.778672    0.0003

R-squared 0.601703          S.E. of regression 0.609456

ln(s): log of savings rate (defined as LNS in MRW.XLS)
ln(n + g + δ): log of 0.05 + rate of growth of population (defined as LNNGD in MRW.XLS)


Some considerations on these results are in order.

First, we have specified a model by deriving it directly from the theory and
we have estimated it to derive empirical evidence on the validity of the model's
predictions. In the light of the adoption of this specific research strategy, the
residuals of the estimated model could be informative, in that they reflect the
impact of all variables omitted from the chosen specification. The analysis of
residuals could be revealing on the mis-specification of the estimated model.

Second, the coefficients have the expected sign but the restrictions implied by
the theory are not exactly satisfied. In fact, the absolute values of the point
estimates of the two elasticities are different, and their magnitude does not match
available information on the capital share. MRW observe that the empirical
observation of a capital share of about one third is consistent with
an elasticity of per capita output with respect to the saving rate of about
0.5 and an elasticity of per capita output with respect to (n + g + δ) of about
-0.5. The natural question which arises at this point is related to the nature
of the estimated parameters. Given that they are random variables, it seems obvious
to try and derive their distribution in order to test statistically hypotheses of
economic interest. An interesting general hypothesis regards the significance of
the estimated coefficients, while more specific hypotheses are of interest in testing
the predictions of the theory (in our case $\beta_1 = -\beta_2$, $\beta_1 = 0.5$, $\beta_2 = -0.5$).
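Before any distribution theory is introduced, a back-of-the-envelope calculation already gives a feel for the distance between the estimates and the theoretical values. The sketch below uses only the numbers printed in Table 1; the variable names are ours, and the calculation is informal because the sampling distribution of these ratios has not yet been derived at this point of the chapter. It expresses the gap between each estimated elasticity and its theoretical value in units of standard errors, and backs out the value of $\alpha$ implied by each coefficient through $\beta_1 = \alpha/(1-\alpha)$ and $\beta_2 = -\alpha/(1-\alpha)$.

```python
# Point estimates and standard errors copied from Table 1
beta1, se1 = 1.325353, 0.170611      # coefficient on ln(s)
beta2, se2 = -2.013390, 0.532830     # coefficient on ln(n + g + delta)

# Distance of each estimate from its theoretical value, in standard errors
dist_beta1 = (beta1 - 0.5) / se1
dist_beta2 = (beta2 - (-0.5)) / se2

# Value of alpha implied by each coefficient taken on its own
alpha_from_beta1 = beta1 / (1.0 + beta1)     # solves beta1 = alpha / (1 - alpha)
alpha_from_beta2 = -beta2 / (1.0 - beta2)    # solves beta2 = -alpha / (1 - alpha)

print(dist_beta1, dist_beta2)
print(alpha_from_beta1, alpha_from_beta2)
```

Both estimates lie several standard errors away from plus and minus 0.5, and the implied values of $\alpha$ (roughly 0.57 and 0.67) are well above one third, which is the informal counterpart of the observation that the theoretical restrictions are not exactly satisfied.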

1.5 Residual analysis 

Consider the following representation:

$$\hat{\epsilon} = y - X\hat{\beta} = y - X(X'X)^{-1}X'y = My$$

where $M = I_n - Q$ and $Q = X(X'X)^{-1}X'$. The (n × n) matrices M and Q
have the following properties:

i) they are symmetric: M' = M, Q' = Q;

ii) they are idempotent: QQ = Q, MM = M;

iii) MX = 0, MQ = 0, QX = X.

Note that the OLS projection for y can be written as $\hat{y} = X\hat{\beta} = Qy$; note
also that $\hat{\epsilon} = My$, from which we have the known result of orthogonality between
the OLS residuals and regressors. We also have
$My = MX\beta + M\epsilon = M\epsilon$, given that MX = 0. Therefore we have a very
well specified relation between the OLS residuals and the errors in the model,
$\hat{\epsilon} = M\epsilon$, which cannot be used to derive the errors given the residuals as the M
matrix is not invertible.

We can re-write the sum of squared residuals as:

$$S(\hat{\beta}) = \hat{\epsilon}'\hat{\epsilon} = \epsilon'M'M\epsilon = \epsilon'M\epsilon$$

$S(\hat{\beta})$ is an obvious candidate for the construction of an estimate for $\sigma^2$. To
derive an estimate of $\sigma^2$ from $S(\hat{\beta})$ the concept of trace is useful. The trace of a
square matrix is the sum of all elements on its principal diagonal. The following
properties are relevant:

i) given any two square matrices A and B, tr(A + B) = tr A + tr B;

ii) given any two matrices A and B for which both products are defined, tr(AB) = tr(BA);

iii) the rank of an idempotent matrix is equal to its trace.

By using property ii) together with the fact that a scalar coincides with its
trace we have:

$$\epsilon'M\epsilon = tr(\epsilon'M\epsilon) = tr(M\epsilon\epsilon')$$

Now we can analyze the expected value of $S(\hat{\beta})$, conditional upon X:

$$E(S(\hat{\beta}) \mid X) = E(tr\, M\epsilon\epsilon' \mid X)$$
$$= tr\, E(M\epsilon\epsilon' \mid X)$$
$$= tr\, M E(\epsilon\epsilon' \mid X)$$
$$= \sigma^2\, tr\, M$$

but, by using properties i) and ii), we have:

$$tr\, M = tr\, I_n - tr\left(X(X'X)^{-1}X'\right)$$
$$= n - tr\left(X'X(X'X)^{-1}\right)$$
$$= n - k$$

Therefore an unbiased estimate of $\sigma^2$ is given by $S(\hat{\beta})/(n-k)$.





This result allows the construction of the standard errors for the estimated
OLS parameters reported in the second column of Table 1.
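The properties of M and Q, and the resulting estimate of the error variance, can be reproduced in a few lines. The following sketch uses simulated data; the design matrix, coefficients and value of $\sigma$ are assumptions for illustration, not the MRW data. It builds the two projection matrices, checks idempotency, MX = 0 and tr M = n - k, and then computes $s^2 = S(\hat{\beta})/(n-k)$ and the implied standard errors.

```python
import numpy as np

rng = np.random.default_rng(4)

n, k, sigma = 75, 3, 0.6                      # sigma is an assumed value
X = np.column_stack([np.ones(n), rng.normal(size=(n, k - 1))])
XtX_inv = np.linalg.inv(X.T @ X)

Q = X @ XtX_inv @ X.T                         # projection onto the columns of X
M = np.eye(n) - Q                             # residual-maker matrix

eps = rng.normal(scale=sigma, size=n)         # unobservable errors
y = X @ np.array([5.0, 0.5, -0.5]) + eps

resid = M @ y                                 # OLS residuals, equal to M @ eps
s2 = resid @ resid / (n - k)                  # unbiased estimate of sigma^2
std_errors = np.sqrt(s2 * np.diag(XtX_inv))   # standard errors of the OLS estimates

print(np.trace(M), n - k)
print(s2, sigma ** 2)
print(std_errors)
```

In a single sample $s^2$ will differ from $\sigma^2$; unbiasedness is a statement about the average over repeated samples.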

Using the result of orthogonality between the OLS projections and the OLS
residuals we can write:

$$var(y) = var(\hat{y}) + var(\hat{\epsilon})$$

from which we can derive the following, residual based, indicator of goodness
of fit:

$$R^2 = \frac{var(\hat{y})}{var(y)} = 1 - \frac{var(\hat{\epsilon})}{var(y)}$$

The information contained in the $R^2$ is to be associated with the information
contained in the standard error of the regression, which is the square root of
the estimated variance of OLS residuals. Note that, when a model is estimated
in logarithms, the standard error of the regression does not depend on the units
of measurement in which the variables are expressed. In fact, we have:

$$\hat{\epsilon} = \log(y) - \log(\hat{y}) = \log\left(\frac{y}{\hat{y}}\right) = \log\left(1 + \frac{y - \hat{y}}{\hat{y}}\right) \approx \frac{y - \hat{y}}{\hat{y}}$$

When the models are not specified in logs, standard errors are usually
interpreted by dividing them by the mean of the dependent variable.
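Both claims, the variance decomposition behind the $R^2$ and the invariance of the residuals to the units of measurement in a log specification, can be checked on simulated data. In the sketch below the data-generating values and the helper `ols_fit` are assumptions introduced for the example; a change of units is represented by multiplying y by 1000, which adds log(1000) to log(y).

```python
import numpy as np

rng = np.random.default_rng(5)

n = 75
X = np.column_stack([np.ones(n), rng.normal(size=n)])
log_y = X @ np.array([1.0, 2.0]) + rng.normal(scale=0.5, size=n)   # assumed model


def ols_fit(X, y):
    """Return fitted values and residuals from an OLS regression of y on X."""
    beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
    fitted = X @ beta_hat
    return fitted, y - fitted


fitted, resid = ols_fit(X, log_y)

# Two equivalent ways of computing R^2 when a constant is included
r2_from_fit = fitted.var() / log_y.var()
r2_from_resid = 1.0 - resid.var() / log_y.var()

# Expressing y in different units only shifts log(y) by a constant,
# which is absorbed by the intercept and leaves the residuals unchanged
_, resid_rescaled = ols_fit(X, log_y + np.log(1000.0))

print(r2_from_fit, r2_from_resid)
print(np.max(np.abs(resid - resid_rescaled)))
```

The equality of the two $R^2$ expressions depends on the presence of a constant among the regressors; without it the decomposition of the variance does not hold in general.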


1.6 Elements of distribution theory 

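We consider the distribution of a generic n-dimensional vector x, together
with the derived distribution of the vector y = g(x), which admits the inverse x =
h(y), with $h = g^{-1}$. If $prob(x_1 < x < x_2) = \int_{x_1}^{x_2} f(x)\,dx$ and
$prob(y_1 < y < y_2) = \int_{y_1}^{y_2} f^*(y)\,dy$, then

$$f^*(y) = f(h(y))\,|J|$$

where

$$J = \det\left(\frac{\partial h}{\partial y'}\right) = \det\begin{pmatrix} \frac{\partial h_1}{\partial y_1} & \cdots & \frac{\partial h_1}{\partial y_n} \\ \vdots & & \vdots \\ \frac{\partial h_n}{\partial y_1} & \cdots & \frac{\partial h_n}{\partial y_n} \end{pmatrix}$$
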

1.6.1 The normal distribution 

The standardized normal univariate has the following distribution:

$$f(z) = \frac{1}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2}\right)$$

$$E(z) = 0, \quad var(z) = 1$$

By considering the transformation $x = \sigma z + \mu$, we derive the distribution of
the univariate normal as:

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\exp\left(-\frac{(x-\mu)^2}{2\sigma^2}\right)$$

$$E(x) = \mu, \quad var(x) = \sigma^2$$

Consider now the vector $z = (z_1, z_2, \ldots, z_n)'$, such that

$$f(z) = \prod_{i=1}^{n} f(z_i) = (2\pi)^{-\frac{n}{2}}\exp\left(-\frac{1}{2}z'z\right)$$

z is, by construction, a vector of independent normal variables with zero mean
and identity variance-covariance matrix. The conventional notation is $z \sim N(0, I_n)$.
Consider now the following linear transformation


$$x = Az + \mu$$

where A is an (n × n) invertible matrix. We consider the inverse transformation
$z = A^{-1}(x - \mu)$, with Jacobian $J = |A^{-1}| = \frac{1}{|A|}$. By applying the formula
for the transformation of variables, we have:

$$f(x) = (2\pi)^{-\frac{n}{2}}\,|A^{-1}|\,\exp\left(-\frac{1}{2}(x-\mu)'A^{-1\prime}A^{-1}(x-\mu)\right)$$

which, by defining the positive definite matrix $\Sigma = AA'$, can be re-written as
follows:

$$f(x) = (2\pi)^{-\frac{n}{2}}\,|\Sigma|^{-\frac{1}{2}}\exp\left(-\frac{1}{2}(x-\mu)'\Sigma^{-1}(x-\mu)\right) \tag{1.19}$$

The conventional notation for the multivariate normal is $x \sim N(\mu, \Sigma)$. A
useful theorem is related to the multivariate normal:

Theorem 1.1 For any $x \sim N(\mu, \Sigma)$, given any (m × n) matrix B and any (m × 1)
vector d, if y = Bx + d, then $y \sim N(B\mu + d, B\Sigma B')$.
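Theorem 1.1 can be illustrated by simulation. In the sketch below the values of $\mu$, A, B and d are arbitrary assumptions chosen for the example. We generate x = Az + μ from standard normal draws, apply the linear transformation y = Bx + d, and compare the sample mean and covariance of y with $B\mu + d$ and $B\Sigma B'$. The simulation checks only the first two moments, not normality itself.

```python
import numpy as np

rng = np.random.default_rng(6)

# Assumed ingredients of the example
mu = np.array([1.0, -1.0, 0.5])
A = np.array([[1.0, 0.0, 0.0],
              [0.5, 1.0, 0.0],
              [-0.3, 0.2, 1.0]])
Sigma = A @ A.T                                # variance-covariance matrix of x
B = np.array([[1.0, 2.0, 0.0],
              [0.0, -1.0, 1.0]])
d = np.array([0.5, 0.0])

z = rng.standard_normal(size=(200000, 3))      # rows are draws of z ~ N(0, I)
x = z @ A.T + mu                               # rows are draws of x = A z + mu
y = x @ B.T + d                                # rows are draws of y = B x + d

sample_mean = y.mean(axis=0)
sample_cov = np.cov(y, rowvar=False)
theory_mean = B @ mu + d
theory_cov = B @ Sigma @ B.T

print(sample_mean, theory_mean)
print(sample_cov)
print(theory_cov)
```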
Consider partitioning an n-variate normal vector in two sub-vectors of
dimensions $n_1$ and $n - n_1$ as follows:

$$\begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \sim N\left(\begin{pmatrix} \mu_1 \\ \mu_2 \end{pmatrix}, \begin{pmatrix} \Sigma_{11} & \Sigma_{12} \\ \Sigma_{21} & \Sigma_{22} \end{pmatrix}\right).$$

By applying the above theorem, we obtain the two following results:

i) $x_1 \sim N(\mu_1, \Sigma_{11})$, which is obtained by applying the theorem in the case
$d = 0$, $B = (I_{n_1} \;\; 0)$;

ii) $(x_1 \mid x_2) \sim N\left(\mu_1 + \Sigma_{12}\Sigma_{22}^{-1}(x_2 - \mu_2),\; \Sigma_{11} - \Sigma_{12}\Sigma_{22}^{-1}\Sigma_{21}\right)$, which is
obtained by applying the theorem in the case $d = \Sigma_{12}\Sigma_{22}^{-1}x_2$, $B = (I_{n_1} \;\; -\Sigma_{12}\Sigma_{22}^{-1})$.

Result ii) shows clearly that absence of correlation is equivalent to
independence within the framework of a multivariate normal. This result is justified
by the fact that the normal distribution is entirely described by its first two
moments.

1.6.2 Distributions derived from the normal

Consider $z \sim N(0, I_n)$, an n-variate standard normal. The distribution of
$\omega = z'z$ is defined as a $\chi^2$ with n degrees of freedom. Consider now two vectors $z_1$
and $z_2$, respectively of dimension $n_1$ and $n_2$, with the following distribution:

$$\begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \sim N\left(\begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} I_{n_1} & 0 \\ 0 & I_{n_2} \end{pmatrix}\right)$$

Then we have $\omega_1 = z_1'z_1 \sim \chi^2(n_1)$, $\omega_2 = z_2'z_2 \sim \chi^2(n_2)$, and
$\omega_1 + \omega_2 = z_1'z_1 + z_2'z_2 \sim \chi^2(n_1 + n_2)$; in general, the sum of two independent $\chi^2$ is itself
distributed as a $\chi^2$ with a number of degrees of freedom equal to the sum of the
degrees of freedom of the two $\chi^2$.

From our discussion of the multivariate normal it follows that if $x \sim N(\mu, \Sigma)$,
then $(x - \mu)'\Sigma^{-1}(x - \mu) \sim \chi^2(n)$.

A related result establishes that if $z \sim N(0, I_n)$ and M is a symmetric
idempotent (n × n) matrix of rank r, then $z'Mz \sim \chi^2(r)$.

Another distribution related to the normal is the F distribution. The F
distribution is obtained as the ratio of two independent $\chi^2$ divided by the respective
degrees of freedom. Given $\omega_1 \sim \chi^2(n_1)$ and $\omega_2 \sim \chi^2(n_2)$, we have:

$$\frac{\omega_1/n_1}{\omega_2/n_2} \sim F(n_1, n_2).$$

The Student's t distribution is then defined as follows:

$$t_n = \sqrt{F(1, n)}.$$

Another useful result establishes that two quadratic forms in the standard
multivariate normal, z'Mz and z'Qz, are independent if MQ = 0. We can finally
state the following theorem, which is fundamental to statistical inference in
the linear model:

Theorem 1.2 If $z \sim N(0, I_n)$, with M and Q symmetric and idempotent matrices
of rank r and s respectively and MQ = 0, then we have: $\frac{z'Qz/s}{z'Mz/r} \sim F(s, r)$.
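The results on quadratic forms can be checked by simulation with the projection matrices used in the regression model. In the sketch below, Q projects onto the columns of an arbitrary (assumed) matrix X and M = I - Q, so that MQ = 0, Q has rank k and M has rank n - k. We compare the simulated means of z'Qz and z'Mz with their degrees of freedom, and the mean of the ratio in Theorem 1.2 with r/(r - 2), the mean of an F(s, r) variable for r > 2, a standard fact not stated in the text.

```python
import numpy as np

rng = np.random.default_rng(7)

n, k = 20, 4
X = rng.normal(size=(n, k))                    # arbitrary full-rank matrix (assumed)
Q = X @ np.linalg.inv(X.T @ X) @ X.T           # symmetric idempotent, rank s = k
M = np.eye(n) - Q                              # symmetric idempotent, rank r = n - k

z = rng.standard_normal(size=(100000, n))      # rows are draws of z ~ N(0, I_n)
q_form = ((z @ Q) * z).sum(axis=1)             # z'Qz for each draw
m_form = ((z @ M) * z).sum(axis=1)             # z'Mz for each draw
f_ratio = (q_form / k) / (m_form / (n - k))    # the ratio in Theorem 1.2

print(q_form.mean(), k)                        # mean of a chi-square equals its d.o.f.
print(m_form.mean(), n - k)
print(f_ratio.mean(), (n - k) / (n - k - 2))
print(np.corrcoef(q_form, m_form)[0, 1])       # near zero, consistent with independence
```

A correlation close to zero is of course only a necessary symptom of independence; the theorem itself delivers the stronger result.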

1.7 Inference in the linear regression model 

In order to perform inference in the linear regression model a further hypothesis
is needed to specify the distribution of y conditional upon X:

$$y \mid X \sim N(X\beta, \sigma^2 I) \tag{1.20}$$

or, equivalently

$$u \mid X \sim N(0, \sigma^2 I) \tag{1.21}$$

Given (1.20) we can immediately derive the distribution of $(\hat{\beta} \mid X)$ which,
being a linear combination of normally distributed variables, is also normal:

$$(\hat{\beta} \mid X) \sim N\left(\beta, \sigma^2(X'X)^{-1}\right). \tag{1.22}$$

(1.22) constitutes the basis to construct confidence intervals and to perform hypothesis testing in the linear regression model. Consider the following expression:

$$\frac{(\hat\beta - \beta)'X'X(\hat\beta - \beta)}{\sigma^2} = \frac{u'X(X'X)^{-1}X'X(X'X)^{-1}X'u}{\sigma^2} = \frac{u'Qu}{\sigma^2}$$

with $Q = X(X'X)^{-1}X'$ and, by applying the results derived in the previous section, we know that:

$$\frac{u'Qu}{\sigma^2} \,\Big|\, X \sim \chi^2(k). \tag{1.23}$$


(1.23) is not very useful in practice, as we do not know $\sigma^2$. However, we know that

$$\frac{(T-k)s^2}{\sigma^2} \,\Big|\, X = \frac{u'Mu}{\sigma^2} \,\Big|\, X \sim \chi^2(T-k). \tag{1.24}$$
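The properties of $Q$ and $M$ used in (1.23) and (1.24) can be checked numerically. The following is a minimal sketch, assuming the usual definitions $Q = X(X'X)^{-1}X'$ and $M = I - Q$; the sample size, the number of regressors and the simulated regressors are arbitrary illustrative choices, not data from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
T, k = 50, 3  # hypothetical sample size and number of regressors
X = np.column_stack([np.ones(T), rng.normal(size=(T, k - 1))])

Q = X @ np.linalg.inv(X.T @ X) @ X.T  # projection onto the column space of X
M = np.eye(T) - Q                     # projection onto its orthogonal complement

# For a symmetric idempotent matrix the rank equals the trace
rank_Q = int(round(float(np.trace(Q))))
rank_M = int(round(float(np.trace(M))))

print("Q idempotent:", np.allclose(Q @ Q, Q))
print("M idempotent:", np.allclose(M @ M, M))
print("MQ = 0:", np.allclose(M @ Q, 0.0))
print("rank(Q), rank(M):", rank_Q, rank_M)
```

The ranks $k$ and $T-k$ are the degrees of freedom of the two $\chi^2$ distributions, and $MQ = 0$ is the condition that makes the two quadratic forms independent.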


As $MQ = 0$, we know the distribution of the ratio of (1.23) to (1.24); moreover, by taking the ratio we get rid of the unknown term $\sigma^2$:

$$\frac{(\hat\beta - \beta)'X'X(\hat\beta - \beta)/\sigma^2}{u'Mu/\sigma^2}\,(T-k) \sim kF(k, T-k). \tag{1.25}$$

Result (1.25) can be used by obtaining from the tables of the F distribution the critical value $F^*_\alpha(k, T-k)$ such that:

$$\text{prob}\left[F(k, T-k) > F^*_\alpha(k, T-k)\right] = \alpha, \qquad 0 < \alpha < 1.$$

For different values of $\alpha$ we are in the position of evaluating exactly inequalities of the following form:

$$\text{prob}\left\{(\hat\beta - \beta)'X'X(\hat\beta - \beta) \le k s^2 F^*_\alpha(k, T-k)\right\} = 1 - \alpha$$

which define confidence intervals for $\beta$ centred upon $\hat\beta$. Hypothesis testing is strictly linked to the derivation of confidence intervals. When we test hypotheses, we aim at rejecting the validity of restrictions imposed on the model on the basis of the sample evidence. Within this framework, our hypotheses (1.12)-(1.22) are the maintained hypotheses, and the restricted version of the model is identified with the null hypothesis $H_0$. Following the Neyman-Pearson approach to hypothesis testing, a statistic with known distribution under the null is derived. Then the probability of a first-type error (rejecting $H_0$ when it is true) is fixed at $\alpha$. For example, a test at level $\alpha$ of the null hypothesis $\beta = \beta_0$, based on the F-statistic, is obtained by not rejecting $H_0$ if $\beta_0$ lies within the confidence interval associated with the probability $1 - \alpha$. In practice, however, this is not a useful way of proceeding, as the economic hypotheses of interest very rarely involve a number of restrictions equal to the number of estimated parameters. Reconsider the Solow growth model: we have three estimated parameters but only one restriction.

The general case of interest to the economist is the one in which we have $r$ restrictions on the vector of parameters, with $r < k$. If we limit our interest to the class of linear restrictions, we can express them as follows:

$$H_0: R\beta = r$$

where $R$ is an $(r \times k)$ matrix of parameters with rank $r$ and $r$ is an $(r \times 1)$ vector of parameters. To illustrate how $R$ and $r$ are constructed, consider the baseline case of the Solow model: we want to impose the restriction $\beta_1 = -\beta_2$ on the following specification:

$$\ln y_i = \beta_0 + \beta_1 \ln(s_i) + \beta_2 \ln(n_i + g + \delta) + \epsilon_i \tag{1.26}$$

In this case $R\beta = r$ becomes

$$\begin{pmatrix} 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = (0).$$

The distribution of a known statistic under the null can be derived by applying known results.

If $(\hat\beta \mid X) \sim N(\beta, \sigma^2 (X'X)^{-1})$, then we have

$$(R\hat\beta - r \mid X) \sim N(R\beta - r, \sigma^2 R(X'X)^{-1}R'). \tag{1.27}$$

The test is constructed by deriving the distribution of (1.27) under the null $R\beta - r = 0$. Given that

$$R\hat\beta - r = R\beta - r + R(X'X)^{-1}X'u,$$

under $H_0$ we have

$$(R\hat\beta - r)'\left(R(X'X)^{-1}R'\right)^{-1}(R\hat\beta - r) = u'X(X'X)^{-1}R'\left(R(X'X)^{-1}R'\right)^{-1}R(X'X)^{-1}X'u = u'Pu$$

where $P$ is a symmetric idempotent matrix of rank $r$, orthogonal to $M$. We then have:

$$\frac{(R\hat\beta - r)'\left(R(X'X)^{-1}R'\right)^{-1}(R\hat\beta - r)}{s^2} \sim rF(r, T-k) \quad \text{under } H_0,$$

which can be used to test the relevant hypotheses. We report the application of this methodology to our economic case of interest in Table 2.

TABLE 2: Testing linear restrictions on equation (1.11)

Null hypothesis                          F(n1, n2)                Probability
$\beta_1 = -\beta_2$                     F(1, 72) = 1.255627      0.266204
$\beta_1 = 0.5$, $\beta_2 = -0.5$        F(2, 72) = 23.48172      0.000000

The null hypothesis $\beta_1 = -\beta_2$ cannot be rejected for values of $\alpha$ smaller than 0.2662; therefore this hypothesis cannot be rejected at the conventional five per cent level. The null hypothesis $\beta_1 = 0.5$, $\beta_2 = -0.5$ is instead rejected at the conventional five per cent level, and also at the one per cent level. Note that, in Table 1, we have already reported the t-values on the estimated coefficients, which did reject the null hypothesis of each coefficient being equal to zero at conventional critical levels. An interesting specific case of the test of the validity of restrictions on estimated coefficients is the test for the significance of a subset of coefficients, which we are going to discuss in the next section.
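The F test of a linear restriction $R\beta = r$ can be sketched numerically as follows. This is an illustration on simulated data, not a replication of Table 2: the data-generating values, the seed and the variable names are assumptions chosen so that $\beta_1 = -\beta_2$ holds, with $T = 75$ and $k = 3$ so that the degrees of freedom match the $F(1, 72)$ case. The sketch also checks that the quadratic-form statistic coincides with the one computed from restricted and unrestricted sums of squared residuals.

```python
import numpy as np

rng = np.random.default_rng(1)
T = 75
x1 = rng.normal(size=T)
x2 = rng.normal(size=T)
# Simulated DGP in which beta1 = -beta2 holds (illustrative values only)
y = 1.0 + 1.4 * x1 - 1.4 * x2 + rng.normal(scale=0.5, size=T)

X = np.column_stack([np.ones(T), x1, x2])
k = X.shape[1]
XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (T - k)

R = np.array([[0.0, 1.0, 1.0]])  # restriction beta1 + beta2 = 0
r = np.array([0.0])
q = R.shape[0]                   # number of restrictions

# Quadratic form in (R beta_hat - r), divided by q * s^2
d = R @ beta_hat - r
F_quadratic = d @ np.linalg.inv(R @ XtX_inv @ R.T) @ d / (q * s2)

# Same statistic from restricted and unrestricted sums of squared residuals:
# imposing beta1 = -beta2 means regressing y on a constant and (x1 - x2)
X_restricted = np.column_stack([np.ones(T), x1 - x2])
b_restricted = np.linalg.lstsq(X_restricted, y, rcond=None)[0]
ssr_r = np.sum((y - X_restricted @ b_restricted) ** 2)
ssr_u = resid @ resid
F_ssr = ((ssr_r - ssr_u) / q) / (ssr_u / (T - k))

print(F_quadratic, F_ssr)
```

The p-value would then be read from the $F(1, 72)$ distribution; it is omitted here to keep the sketch dependent on NumPy only.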

1.7.1 Testing the significance of a subset of coefficients

In the general framework to test linear restrictions, set $r = 0$, $R = [I_r \;\; 0]$, and partition $\beta$ in a corresponding way into $[\beta_1' \;\; \beta_2']'$. In this case the restrictions $R\beta - r = 0$ are equivalent to $\beta_1 = 0$ in the partitioned regression model:

$$y = X_1\beta_1 + X_2\beta_2 + u$$

in which the partitioning creates two blocks of regressors of dimension $r$ and $k - r$.

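Before proceeding to the discussion of hypothesis testing, it is useful to derive the formula for the OLS estimator in the partitioned regression model. To obtain such results, partition the "normal equations" $X'X\hat\beta = X'y$ as follows:

$$\begin{pmatrix} X_1' \\ X_2' \end{pmatrix} \begin{pmatrix} X_1 & X_2 \end{pmatrix} \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix} = \begin{pmatrix} X_1' \\ X_2' \end{pmatrix} y$$

or, equivalently,

$$\begin{pmatrix} X_1'X_1 & X_1'X_2 \\ X_2'X_1 & X_2'X_2 \end{pmatrix} \begin{pmatrix} \hat\beta_1 \\ \hat\beta_2 \end{pmatrix} = \begin{pmatrix} X_1'y \\ X_2'y \end{pmatrix}. \tag{1.28}$$

System (1.28) can be solved in two stages, by first deriving an expression for $\hat\beta_2$ as follows:

$$\hat\beta_2 = (X_2'X_2)^{-1}(X_2'y - X_2'X_1\hat\beta_1)$$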
and then by substituting it in the first equation of (1.28) to obtain:

$$X_1'X_1\hat\beta_1 + X_1'X_2(X_2'X_2)^{-1}(X_2'y - X_2'X_1\hat\beta_1) = X_1'y$$

from which we have (see footnote 2):

$$\hat\beta_1 = (X_1'M_2X_1)^{-1}X_1'M_2y$$
$$M_2 = I - X_2(X_2'X_2)^{-1}X_2'.$$

Note that, as $M_2$ is symmetric and idempotent, we can also write:

$$\hat\beta_1 = (X_1'M_2'M_2X_1)^{-1}X_1'M_2'M_2y$$

and $\hat\beta_1$ can be interpreted as the vector of OLS coefficients of the regression of $y$ on the matrix of residuals of the regression of $X_1$ on $X_2$. So an OLS regression on two sets of regressors is equivalent to two OLS regressions, each on a single set of regressors (Frisch-Waugh theorem).
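The Frisch-Waugh result can be verified on simulated data. The sketch below is only illustrative: the regressors, coefficients and seed are assumptions. It compares the coefficient on $X_1$ from the full regression with the partitioned formula and with the two-step procedure (residuals of $X_1$ on $X_2$, then regression of $y$ on those residuals).

```python
import numpy as np

rng = np.random.default_rng(2)
T = 100
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
# X1 is a single regressor correlated with X2 (illustrative construction)
X1 = (0.5 * X2[:, 1] + rng.normal(size=T)).reshape(-1, 1)
y = 2.0 * X1[:, 0] + X2 @ np.array([1.0, -1.0]) + rng.normal(size=T)

# Full regression of y on [X1, X2]
X = np.column_stack([X1, X2])
beta_full = np.linalg.lstsq(X, y, rcond=None)[0]

# Partitioned formula: beta1 = (X1' M2 X1)^{-1} X1' M2 y
M2 = np.eye(T) - X2 @ np.linalg.inv(X2.T @ X2) @ X2.T
beta1_partitioned = np.linalg.solve(X1.T @ M2 @ X1, X1.T @ M2 @ y)

# Two-step procedure: residuals of X1 on X2, then regress y on them
X1_residuals = M2 @ X1
beta1_two_step = np.linalg.lstsq(X1_residuals, y, rcond=None)[0]

print(beta_full[0], beta1_partitioned[0], beta1_two_step[0])
```

All three numbers coincide up to rounding error, which is the content of the theorem.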

Finally, consider the residuals of the partitioned model:

$$\begin{aligned}
\hat u &= y - X_1\hat\beta_1 - X_2\hat\beta_2 \\
&= y - X_1\hat\beta_1 - X_2(X_2'X_2)^{-1}(X_2'y - X_2'X_1\hat\beta_1) \\
&= M_2y - M_2X_1\hat\beta_1 \\
&= M_2y - M_2X_1(X_1'M_2X_1)^{-1}X_1'M_2y \\
&= \left(M_2 - M_2X_1(X_1'M_2X_1)^{-1}X_1'M_2\right)y
\end{aligned}$$

but we already know that $\hat u = My$; therefore it must be that

$$M = M_2 - M_2X_1(X_1'M_2X_1)^{-1}X_1'M_2. \tag{1.29}$$

We are now in the position to reconsider testing for our null of interest. Under $H_0$, $X_1$ has no additional explanatory power for $y$ with respect to $X_2$, therefore:

$$H_0: \; y = X_2\beta_2 + u, \qquad (u \mid X_1, X_2) \sim N(0, \sigma^2 I).$$

Note that the statement

$$y = X_2\gamma_2 + u, \qquad (u \mid X_2) \sim N(0, \sigma^2 I)$$

is always true within our maintained hypotheses. However, in general $\gamma_2 \neq \beta_2$.

2 Note that the expression for the estimator can be obtained by applying directly to the normal equations the formula of the partitioned inverse:

$$\begin{pmatrix} A & B \\ C & D \end{pmatrix}^{-1} = \begin{pmatrix} E & -EBD^{-1} \\ -D^{-1}CE & D^{-1} + D^{-1}CEBD^{-1} \end{pmatrix}, \qquad E = (A - BD^{-1}C)^{-1}.$$

In order to derive a statistic to test $H_0$, remember that the general matrix $R(X'X)^{-1}R'$ is here the upper left block of $(X'X)^{-1}$, which we can now write as $(X_1'M_2X_1)^{-1}$. The statistic then takes the following form:

$$\frac{\hat\beta_1'(X_1'M_2X_1)\hat\beta_1}{rs^2} = \frac{y'M_2X_1(X_1'M_2X_1)^{-1}X_1'M_2y}{y'My}\,\frac{T-k}{r} \sim F(r, T-k). \tag{1.30}$$

Given (1.29), (1.30) can be rewritten as:

$$\frac{y'M_2y - y'My}{y'My}\,\frac{T-k}{r} \sim F(r, T-k) \tag{1.31}$$

where the denominator is the sum of squared residuals in the unconstrained model, while the numerator is the difference between the sum of squared residuals in the constrained model and the sum of squared residuals in the unconstrained model.
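The equivalence between (1.30) and (1.31) can be checked numerically. The following is a minimal sketch on simulated data; the dimensions, coefficients, seed and the helper name `residual_maker` are illustrative assumptions, and the data are generated so that $\beta_1 = 0$ holds.

```python
import numpy as np

rng = np.random.default_rng(3)
T, r = 80, 2
X1 = rng.normal(size=(T, r))  # block of regressors under test
X2 = np.column_stack([np.ones(T), rng.normal(size=T)])
# Simulated DGP in which beta1 = 0 (illustrative values only)
y = X2 @ np.array([1.0, 0.5]) + rng.normal(size=T)

X = np.column_stack([X1, X2])
k = X.shape[1]


def residual_maker(Z):
    """Return I - Z (Z'Z)^{-1} Z' (hypothetical helper name)."""
    return np.eye(Z.shape[0]) - Z @ np.linalg.inv(Z.T @ Z) @ Z.T


M = residual_maker(X)    # unconstrained model
M2 = residual_maker(X2)  # constrained model (X1 excluded)

# Form (1.30): quadratic form in the projection of M2 y on M2 X1
numerator_130 = y @ M2 @ X1 @ np.linalg.inv(X1.T @ M2 @ X1) @ X1.T @ M2 @ y
F_130 = numerator_130 / (y @ M @ y) * (T - k) / r

# Form (1.31): difference of sums of squared residuals
F_131 = (y @ M2 @ y - y @ M @ y) / (y @ M @ y) * (T - k) / r

print(F_130, F_131)
```

Form (1.31) is the one usually computed in practice, since it only requires the residual sums of squares from two OLS regressions.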

Consider now the limit case in which $r = 1$ and $\beta_1$ is a scalar. In this case the F-statistic takes the following form:

$$\frac{\hat\beta_1^2}{s^2(X_1'M_2X_1)^{-1}} \sim F(1, T-k) \quad \text{under } H_0$$

where $(X_1'M_2X_1)^{-1}$ is now the element (1,1) of the matrix $(X'X)^{-1}$. Using the result on the relation between the F and the Student's t distribution, we have:

$$\frac{\hat\beta_1}{s(X_1'M_2X_1)^{-1/2}} \sim t(T-k) \quad \text{under } H_0.$$

Therefore an immediate test of the significance of a coefficient can be performed, as is done in Table 1, by taking the ratio between each estimated coefficient and the associated standard error.
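The relation between the t-ratio and the F-statistic for a single coefficient can be illustrated as follows. This is a sketch on simulated data, with arbitrary coefficients and seed; it checks that the squared t-ratio of the first coefficient equals the F-statistic computed from restricted and unrestricted sums of squared residuals.

```python
import numpy as np

rng = np.random.default_rng(4)
T = 60
X = np.column_stack([rng.normal(size=T), np.ones(T), rng.normal(size=T)])
y = X @ np.array([0.3, 1.0, -0.7]) + rng.normal(size=T)  # illustrative DGP
k = X.shape[1]

XtX_inv = np.linalg.inv(X.T @ X)
beta_hat = XtX_inv @ X.T @ y
resid = y - X @ beta_hat
s2 = resid @ resid / (T - k)

# t-ratios: coefficient divided by its estimated standard error
std_err = np.sqrt(s2 * np.diag(XtX_inv))
t_ratios = beta_hat / std_err

# F statistic for H0: first coefficient equal to zero (one restriction)
X_restricted = X[:, 1:]
b_restricted = np.linalg.lstsq(X_restricted, y, rcond=None)[0]
ssr_r = np.sum((y - X_restricted @ b_restricted) ** 2)
ssr_u = resid @ resid
F_stat = (ssr_r - ssr_u) / (ssr_u / (T - k))

print(t_ratios[0] ** 2, F_stat)
```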

Let us now reconsider our results on the Solow growth model. We cannot reject the null of the validity of the model, but the point estimates are rather far from the coefficients predicted on the basis of the theory. Two questions naturally arise at this stage.

First, what is the impact on our coefficients of having estimated the model without imposing the theoretical restrictions? Second, is it possible to explain the discrepancies between the estimated elasticities and the ones predicted by the theory on the basis of the mis-specification of the model, i.e. of the omission from the estimated model of some variables relevant to explain $y$?

In the following two sections we take these two questions in turn.

1.8 Estimation under linear constraints 

In this section we analyze the impact on the OLS estimator of a kind of mis-specification deriving from ignoring the existence of constraints on the estimated parameters. To analyze mis-specification we introduce the distinction between the estimated model and the Data Generating Process (DGP).

The estimated model is the linear model analyzed up to now:

$$y = X\beta + u$$

while the DGP is instead:

$$y = X\beta + u \quad \text{s.t.} \quad R\beta - r = 0,$$

where the constraints have been expressed using the so-called implicit form. A very useful alternative way of expressing constraints, known as the "explicit form", has been proposed by Sargan (1988):

$$\beta = S\theta + s$$

where $S$ is a $(k \times (k - r))$ matrix of rank $k - r$ and $s$ is a $k \times 1$ vector.

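To show how constraints are specified in the two alternatives, let us reconsider the restriction of the Solow growth model, $\beta_1 = -\beta_2$, on the following specification:

$$\ln y_i = \beta_0 + \beta_1 \ln(s_i) + \beta_2 \ln(n_i + g + \delta) + \epsilon_i$$

Using $R\beta - r = 0$ we have:

$$\begin{pmatrix} 0 & 1 & 1 \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = (0) \tag{1.32}$$

while using $\beta = S\theta + s$ we have

$$\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

In practice the constraints in explicit form are written by considering $\theta$ as the vector of free parameters. Note that there is no unique way of expressing constraints in explicit form; in our case the same constraint could have been imposed as follows:

$$\begin{pmatrix} \beta_0 \\ \beta_1 \\ \beta_2 \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} \theta_1 \\ \theta_2 \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}.$$

As the two alternatives are equivalent, $R\beta - r = 0$ can be written as $RS\theta + Rs - r = 0$, which implies:

i) $RS = 0$;

ii) $Rs - r = 0$.

We shall use the explicit form of imposing constraints to derive the Restricted Least Squares (RLS) estimators and to evaluate consistency and relative efficiency of OLS and RLS.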

1.8.1 The Restricted Least Squares (RLS) estimator 

To construct RLS, substitute the constraint in the original model to obtain:

$$y - Xs = XS\theta + u. \tag{1.33}$$

Equation (1.33) can be rewritten as:

$$y^* = X^*\theta + u \tag{1.34}$$

where $y^* = y - Xs$, $X^* = XS$.

Note that the transformed model features the same residuals as the original model; therefore if hypotheses (1.12)-(1.20) hold for the original model, then they also hold for the transformed model. So we apply OLS to the transformed model to obtain:

$$\hat\theta = (X^{*\prime}X^*)^{-1}X^{*\prime}y^* = (S'X'XS)^{-1}S'X'(y - Xs). \tag{1.35}$$

From (1.35) the RLS estimator is easily obtained by applying the transformation $\hat\beta^{RLS} = S\hat\theta + s$. Similarly, the variance of the RLS estimator is easily obtained as follows:

$$\text{var}(\hat\theta \mid X) = \sigma^2 (X^{*\prime}X^*)^{-1} = \sigma^2 (S'X'XS)^{-1}$$

$$\text{var}(\hat\beta^{RLS} \mid X) = \text{var}(S\hat\theta + s \mid X) = S\,\text{var}(\hat\theta \mid X)\,S' = \sigma^2 S(S'X'XS)^{-1}S'.$$
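The construction of the RLS estimator through the explicit form can be sketched as follows. The data are simulated (the variable names `ln_s` and `ln_ngd`, the coefficients and the seed are illustrative assumptions, not the MRW data), and the matrices $S$ and $s$ are one possible explicit form of the restriction $\beta_1 = -\beta_2$.

```python
import numpy as np

rng = np.random.default_rng(5)
T = 75
ln_s = rng.normal(size=T)
ln_ngd = rng.normal(size=T)
# Simulated DGP satisfying beta1 = -beta2 (illustrative values only)
y = 7.0 + 1.4 * ln_s - 1.4 * ln_ngd + rng.normal(scale=0.6, size=T)

X = np.column_stack([np.ones(T), ln_s, ln_ngd])

# Implicit form R beta = r and one explicit form beta = S theta + s
R = np.array([[0.0, 1.0, 1.0]])
r = np.array([0.0])
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, -1.0]])
s = np.zeros(3)

# OLS on the transformed model y* = X* theta + u
X_star = X @ S
y_star = y - X @ s
theta_hat = np.linalg.solve(X_star.T @ X_star, X_star.T @ y_star)
beta_rls = S @ theta_hat + s

# Direct check: regress y on a constant and (ln_s - ln_ngd)
X_direct = np.column_stack([np.ones(T), ln_s - ln_ngd])
theta_direct = np.linalg.lstsq(X_direct, y, rcond=None)[0]

print(beta_rls, theta_direct)
```

With this choice of $S$ the transformed regressors are a constant and the difference of the two logs, which is the single regressor appearing in the constrained specification.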

We report in Table 3 the RLS estimates of the Solow growth model. By comparing Table 1 and Table 3 we note that the estimated coefficients are very close, but the estimates in Table 3 are more precise. We also note that the hypothesis $\beta_1 = 0.5$, $\beta_2 = -0.5$ is still rejected, despite our imposition of the theory-based constraint on the estimated coefficients.


TABLE 3: The estimation of the constrained Solow model

Variable      Coefficient   Std. Error   t-ratio   Prob.
C             7.0857        0.1453       48.65     0.0000
lns-lnngd     1.43687       0.13882      10.35     0.0000

R-squared 0.594     S.E. of regression 0.61



This observation leads naturally to the discussion of the properties of OLS and RLS in the case of a DGP with constraints.

unbiasedness

Under the assumed DGP both estimators are unbiased, as this property depends on the validity of hypotheses (1.12)-(1.20), which is not affected by the imposition of constraints on the parameters.

efficiency

On this issue we note immediately that, if we interpret RLS as the OLS estimator of the transformed model (1.34), the RLS is the most efficient estimator, as the hypotheses for the validity of the Gauss-Markov theorem are satisfied when OLS is applied to (1.34). Note that, by posing $L = (X'X)^{-1}X'$ in the context of the transformed model, we do not in general obtain OLS, but we obtain an estimator whose conditional variance with respect to $X$ coincides with the conditional variance of the OLS estimator.

We support this intuition with a formal argument by showing that the difference between the variance of the OLS estimator and the variance of the RLS estimator is a positive semi-definite matrix:

$$\text{var}(\hat\beta \mid X) - \text{var}(\hat\beta^{RLS} \mid X) = \sigma^2 (X'X)^{-1} - \sigma^2 S(S'X'XS)^{-1}S'.$$

Define $A$ as follows:

$$A = (X'X)^{-1} - S(S'X'XS)^{-1}S'.$$

Given that

$$\begin{aligned}
AX'XA &= \left((X'X)^{-1} - S(S'X'XS)^{-1}S'\right) X'X \left((X'X)^{-1} - S(S'X'XS)^{-1}S'\right) \\
&= (X'X)^{-1} - 2S(S'X'XS)^{-1}S' + S(S'X'XS)^{-1}S'X'XS(S'X'XS)^{-1}S' \\
&= (X'X)^{-1} - S(S'X'XS)^{-1}S' = A
\end{aligned}$$

and that $A$ is symmetric, we have $A = (XA)'(XA)$, so that $A$ is positive semi-definite, being the product of a matrix and its transpose.
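The argument above can be checked numerically. The sketch below uses simulated regressors and the same illustrative $S$ as in the Solow example (both are assumptions made for the illustration); it verifies that $AX'XA = A$ and that the eigenvalues of $A$ are non-negative up to rounding error.

```python
import numpy as np

rng = np.random.default_rng(6)
T = 75
X = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])
S = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, -1.0]])  # explicit form of beta1 = -beta2

XtX = X.T @ X
var_ols = np.linalg.inv(XtX)                      # var(OLS) up to sigma^2
var_rls = S @ np.linalg.inv(S.T @ XtX @ S) @ S.T  # var(RLS) up to sigma^2
A = var_ols - var_rls

# Eigenvalues of the symmetric matrix A, in ascending order
eigenvalues = np.linalg.eigvalsh(A)

print(np.allclose(A @ XtX @ A, A))
print(eigenvalues)
```

With one restriction, $A$ has rank one: two eigenvalues are zero up to rounding and one is strictly positive, so the efficiency gain from RLS is concentrated in one direction of the parameter space.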

The OLS estimator ignores available information and therefore it is less efficient than the RLS estimator. However, there is no difference between the two estimators in terms of unbiasedness.

So far we have evaluated the gains from imposing true restrictions. A related interesting exercise is the evaluation of the loss from imposing false restrictions.

1.9 The effects of mis-specification 

We consider two general cases of mis-specification to evaluate empirically their importance within the Solow model. We take first the case of under-parameterization (the estimated model omits variables included in the DGP) and then move on to the case of over-parameterization (the estimated model includes more variables than the DGP).

We evaluate the effects of mis-specification on the OLS estimators by using results from the partitioned regression model.

1.9.1 Under-parameterisation 

Given the following DGP:

$$y = X_1\beta_1 + X_2\beta_2 + u, \tag{1.36}$$

for which hypotheses (1.12)-(1.20) hold, the following model is estimated:

$$y = X_1\beta_1 + v. \tag{1.37}$$

Therefore, the OLS estimates are given by the following expression:

$$\hat\beta_1^{UP} = (X_1'X_1)^{-1}X_1'y \tag{1.38}$$

while the OLS estimates which would have been obtained by estimation of the DGP are:

$$\hat\beta_1 = (X_1'M_2X_1)^{-1}X_1'M_2y. \tag{1.39}$$

The estimates in (1.39) are Best Linear Unbiased Estimators by construction, while the estimates in (1.38) are biased unless the correlation between $X_1$ and $X_2$ is zero. To show this point, consider that:

$$\hat\beta_1 = (X_1'X_1)^{-1}(X_1'y - X_1'X_2\hat\beta_2) \tag{1.40}$$
$$= \hat\beta_1^{UP} - D\hat\beta_2 \tag{1.41}$$

so that $\hat\beta_1^{UP} = \hat\beta_1 + D\hat\beta_2$, where $D = (X_1'X_1)^{-1}X_1'X_2$ is the matrix of coefficients in the regression of $X_2$ on $X_1$, and $\hat\beta_2$ is the OLS estimator obtained by fitting the DGP.

To provide further interpretation of these results, note that if we have:

$$E(y \mid X_1, X_2) = X_1\beta_1 + X_2\beta_2$$
$$E(X_2 \mid X_1) = X_1D$$

then

$$E(y \mid X_1) = X_1\beta_1 + X_1D\beta_2 = X_1\alpha.$$

Therefore the OLS estimator in the under-parameterized model is a biased estimator of $\beta_1$, but it is an unbiased estimator of $\alpha = \beta_1 + D\beta_2$. Then, if the objective of the model is forecasting and $X_1$ is more easily observed than $X_2$, the under-parameterized model can be safely used. On the other hand, if the objective of the model is to test specific predictions on parameters (as is the case with the Solow growth model), then the use of the under-parameterized model will deliver biased results. When we are interested in the effect of $X_1$ on $y$, independently from other factors, it is crucial to control for the effects of omitted variables.
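The algebra of omitted-variable bias can be illustrated on simulated data. In the sketch below the DGP, the degree of correlation between included and omitted regressors, and the seed are all assumptions made for the illustration. It checks the exact sample identity linking the estimator of the under-parameterized model to the estimators obtained by fitting the DGP.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 200
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
# Omitted regressor, correlated with the included one (illustrative)
X2 = (0.8 * X1[:, 1] + rng.normal(size=T)).reshape(-1, 1)
y = X1 @ np.array([1.0, 0.5]) + 2.0 * X2[:, 0] + rng.normal(size=T)

# Estimates obtained by fitting the DGP (both blocks included)
beta_long = np.linalg.lstsq(np.column_stack([X1, X2]), y, rcond=None)[0]
beta1_hat, beta2_hat = beta_long[:2], beta_long[2:]

# Estimates of the under-parameterized model (X2 omitted)
beta1_up = np.linalg.lstsq(X1, y, rcond=None)[0]

# D: coefficients of the regression of X2 on X1
D = np.linalg.lstsq(X1, X2, rcond=None)[0]

print(beta1_up)
print(beta1_hat + D @ beta2_hat)
```

The two printed vectors coincide: the short regression attributes to $X_1$ the part of the effect of $X_2$ that $X_1$ can predict.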

1.9.2 Over-parameterization 
Given the following DGP:

$$y = X_1\beta_1 + u \tag{1.42}$$

for which hypotheses (1.12)-(1.20) hold, the following model is estimated:

$$y = X_1\beta_1 + X_2\beta_2 + v. \tag{1.43}$$

The OLS estimator of the over-parameterized model is

$$\hat\beta_1^{OP} = (X_1'M_2X_1)^{-1}X_1'M_2y \tag{1.44}$$

while, by estimating the DGP, we obtain:

$$\hat\beta_1 = (X_1'X_1)^{-1}X_1'y. \tag{1.45}$$

By substituting $y$ from the DGP it is immediately shown that both estimators are unbiased. The difference is now made by the variance. In fact we have:

$$\text{var}(\hat\beta_1^{OP} \mid X_1, X_2) = \sigma^2 (X_1'M_2X_1)^{-1} \tag{1.46}$$

$$\text{var}(\hat\beta_1 \mid X_1, X_2) = \sigma^2 (X_1'X_1)^{-1}. \tag{1.47}$$

It can be shown that the estimator derived from the correct model is more efficient. In fact, the difference between the two variance-covariance matrices is a positive semi-definite matrix. To show this result, remember that if two matrices $A$ and $B$ are positive definite and $A - B$ is positive semi-definite, then the matrix $B^{-1} - A^{-1}$ is also positive semi-definite. We then have to show that $X_1'X_1 - X_1'M_2X_1$ is a positive semi-definite matrix. This result is almost immediate:

$$X_1'X_1 - X_1'M_2X_1 = X_1'(I - M_2)X_1 = X_1'Q_2X_1 = X_1'Q_2'Q_2X_1,$$

where $Q_2 = I - M_2 = X_2(X_2'X_2)^{-1}X_2'$ is symmetric and idempotent.

We can then conclude that over-parameterization reduces the efficiency of estimators and the power of tests of hypotheses.
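The loss of efficiency from over-parameterization can be illustrated numerically. The sketch below compares (1.46) and (1.47), up to the common factor $\sigma^2$, on simulated regressors; the design and the seed are illustrative assumptions, with an irrelevant regressor $X_2$ that is correlated with $X_1$.

```python
import numpy as np

rng = np.random.default_rng(8)
T = 100
X1 = np.column_stack([np.ones(T), rng.normal(size=T)])
# Irrelevant regressor, correlated with X1 (illustrative construction)
X2 = (0.9 * X1[:, 1] + rng.normal(size=T)).reshape(-1, 1)

M2 = np.eye(T) - X2 @ np.linalg.inv(X2.T @ X2) @ X2.T

var_over = np.linalg.inv(X1.T @ M2 @ X1)  # (1.46) up to sigma^2
var_correct = np.linalg.inv(X1.T @ X1)    # (1.47) up to sigma^2

# Eigenvalues of the difference: all non-negative if it is p.s.d.
difference_eigenvalues = np.linalg.eigvalsh(var_over - var_correct)

print(np.diag(var_over))
print(np.diag(var_correct))
print(difference_eigenvalues)
```

The diagonal elements of the first matrix are never smaller than those of the second, so the standard errors from the over-parameterized model are at least as large as those from the correct model.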

1.10 Human capital in the Solow’s growth model 

Let us reconsider the question of the estimated elasticities in the Solow growth model. We have seen that the theory-implied restriction on the equality of elasticities cannot be rejected, but that imposing such a constraint does not solve the problem of the implausibly high values for the point estimates of the elasticities. Our discussion of the effect of omitted variables on OLS estimation illustrates a potential solution to the problem. MRW follow this lead and point out that human capital could be the relevant omitted variable. To illustrate the impact of human capital on the Solow growth model, let us augment our simple specification to consider three inputs: physical capital, $K$, human capital, $H$, and labour, $L$. By keeping a constant-returns-to-scale Cobb-Douglas production function, we have:

$$Y_t = K_t^{\alpha} H_t^{\beta} (A_t L_t)^{1 - \alpha - \beta}. \tag{1.48}$$

Define as $s_k$ the fraction of output invested in physical capital and as $s_h$ the fraction of output invested in human capital. We maintain all the other original assumptions of the Solow model and we also assume that physical and human capital depreciate at the same rate. The evolution of the economy over time is now governed by the two following dynamic equations:

$$k_t (1+n)(1+g) = k_{t-1}(1-\delta) + s_k y_{t-1} \tag{1.49}$$

$$h_t (1+n)(1+g) = h_{t-1}(1-\delta) + s_h y_{t-1}. \tag{1.50}$$

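By assuming $\alpha + \beta < 1$, we can derive the steady-state of the economy, defined by the two following relationships:

$$k^* = \left( \frac{s_k^{1-\beta} s_h^{\beta}}{n + g + \delta} \right)^{\frac{1}{1 - \alpha - \beta}} \tag{1.51}$$

$$h^* = \left( \frac{s_k^{\alpha} s_h^{1-\alpha}}{n + g + \delta} \right)^{\frac{1}{1 - \alpha - \beta}}. \tag{1.52}$$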


By substituting these two relationships in the production function and taking logs, we have an expression for the per capita level of output in steady-state:

$$\ln\left(\frac{Y_t}{L_t}\right) = \ln A_0 + gt + \frac{\alpha}{1 - \alpha - \beta}\ln(s_k) + \frac{\beta}{1 - \alpha - \beta}\ln(s_h) - \frac{\alpha + \beta}{1 - \alpha - \beta}\ln(n + g + \delta). \tag{1.53}$$

(1.53) shows how per capita output depends on the rate of growth of population, the rate of accumulation of human capital and the rate of accumulation of physical capital. (1.53) nests (1.8), and illustrates how direct estimation of (1.8) might deliver biased estimates of the parameters of interest as a consequence of under-parameterization. MRW construct a proxy for the rate of accumulation of human capital by merging two data-sets to obtain a measure of the percentage of the working-age population that is in secondary school. They call this variable SCHOOL and they include its logarithm in the regression. In this case, even if SCHOOL is only proportional to $s_h$, it can be safely used in the estimation of the equation of interest, as only the constant will be affected. On the other hand, if SCHOOL is measured with error, the measurement error will deliver bias in the estimates only if it is correlated with the other regressors.
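The steady-state of the augmented model can be illustrated by iterating the two dynamic equations forward. The following is a sketch under stated assumptions: output per effective unit of labour is taken to be $y = k^{\alpha}h^{\beta}$, and all parameter values are hypothetical. Note that the discrete-time equations imply the term $(1+n)(1+g) - (1-\delta) = n + g + ng + \delta$, so the closed forms with $n + g + \delta$ correspond to neglecting the small cross term $ng$; the sketch reports both.

```python
# Hypothetical parameter values, chosen only for illustration
alpha, beta = 1 / 3, 1 / 3
s_k, s_h = 0.2, 0.1
n, g, delta = 0.01, 0.02, 0.03

growth = (1 + n) * (1 + g)
x_exact = growth - (1 - delta)  # equals n + g + n*g + delta
x_approx = n + g + delta        # term used in the closed-form expressions


def steady_state(x):
    """Closed-form steady state for a given growth-plus-depreciation term x."""
    exponent = 1 / (1 - alpha - beta)
    k_star = (s_k ** (1 - beta) * s_h ** beta / x) ** exponent
    h_star = (s_k ** alpha * s_h ** (1 - alpha) / x) ** exponent
    return k_star, h_star


# Iterate the two accumulation equations from an arbitrary starting point
k, h = 1.0, 1.0
for _ in range(5000):
    y = k ** alpha * h ** beta
    k_next = ((1 - delta) * k + s_k * y) / growth
    h_next = ((1 - delta) * h + s_h * y) / growth
    k, h = k_next, h_next

k_exact, h_exact = steady_state(x_exact)
k_approx, h_approx = steady_state(x_approx)

print(k, k_exact, k_approx)
print(h, h_exact, h_approx)
```

With these values the iteration converges to the closed form evaluated at the exact term, and the version with $n + g + \delta$ differs from it by about one per cent.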

The results of the estimation of the augmented Solow model are reported in 
Table 4. 


TABLE 4: The estimation of the augmented Solow model

Variable   Coefficient   Std. Error   t-ratio   Prob.
C          4.451         1.153        3.859     0.0002
lns        0.709         0.15         4.725     0.0000
lnngd      -1.497        0.402        -3.719    0.0004
lsch       0.728         0.095        7.666     0.0000

R-squared 0.782     S.E. of regression 0.45



Note that now none of the model-based restrictions on the coefficients can be rejected, and that the estimated parameters are compatible with values of about 1/3 for $\alpha$ and $\beta$. Such values are deemed to be very reasonable by MRW, who conclude that the estimation of the Solow model without human capital can be considered as a benchmark case to illustrate the effect of under-parameterization.

1.11 The importance of time-series in macroeconomics 

Results obtained thus far are based on a cross-sectional analysis of different countries, without using information on the time-series behaviour of the relevant variables. However, most of the interesting questions in macroeconomics are answered by analyzing the time-series behaviour of variables. The Solow model predicts that each economy converges to its own steady-state. The obvious implication is that, over time, differences in per capita output of countries featuring the same rate of capital accumulation and the same rate of growth of population should disappear. The empirical validity of such a prediction has been heavily questioned. Recently an alternative theory of growth has developed: the endogenous growth theory (Lucas [5], Romer). This theory basically modifies the new-classical growth model by introducing constant returns to scale in the production function for output in effective units of labour ($\alpha + \beta = 1$, instead of $\alpha + \beta < 1$). In this type of model the steady-state level of output is not defined (see footnote 3). Therefore, differences between countries can persist indefinitely even if the countries share the same rates of accumulation of capital and of growth of population.

3 The non-existence of equilibrium generates some problems for the construction of a theory of distribution. These problems are solved by introducing the idea of an aggregate production

If one considers time-series data, then it is very easy to discriminate between the two models on the basis of their different predictions. Consider the $i$-th country in our sample. If we have $t = 1, \dots, T$ time-series observations on the relevant variables, then we can estimate the $\lambda_i$ parameter in the following model:

$$\Delta \ln y_{i,t} = \lambda_i \left( \ln y_{i,t-1} - \ln y^*_{i,t-1} \right) + \epsilon_{it} \tag{1.54}$$

$$\Delta \ln y_{i,t} = \left( \ln y_{i,t} - \ln y_{i,t-1} \right).$$

The new-classical growth model predicts $\lambda_i < 0$, while the endogenous growth model predicts $\lambda_i > 0$.

In fact $\lambda_i < 0$ warrants convergence of $y$ to its steady-state (which is time-varying, in that the rate of accumulation of capital and the rate of population growth might be time-varying), while in the case $\lambda_i > 0$ we do not have convergence of $y$ to its steady-state. The main complication in estimating and testing the model of interest within a time-series context is that the hypothesis $E(u \mid X) = 0$ does not hold, and the derivation of the properties of estimators and of the statistical distributions for hypothesis testing requires a new appropriate framework, which we will discuss in the following chapter.

To reinforce our point on the importance of time series in evaluating macroeconomic theories, we evaluate the loss of information when $\lambda_i$ is estimated by using the 75 cross-sectional observations used so far.

We can rewrite (1.54) as follows:

$$\ln y_{i,t} = (1 + \lambda_i) \ln y_{i,t-1} - \lambda_i \ln y^*_{i,t-1} + \epsilon_{it} \tag{1.55}$$

By recursively substituting in (1.55), we have:

$$\ln y_{i,t} - \ln y_{i,0} = -\left(1 - (1 + \lambda_i)^t\right) \ln y_{i,0} - \lambda_i \ln y^*_i \sum_{j=0}^{t-1} (1 + \lambda_i)^j + \sum_{j=0}^{t-1} (1 + \lambda_i)^j \epsilon_{i,t-j} \tag{1.56}$$

which can be rewritten as

$$\ln y_{i,t} - \ln y_{i,0} = -\left(1 - (1 + \lambda_i)^t\right) \ln y_{i,0} + \left(1 - (1 + \lambda_i)^t\right) \ln y^*_i + v_t \tag{1.57}$$

$$v_t = \sum_{j=0}^{t-1} (1 + \lambda_i)^j \epsilon_{i,t-j}. \tag{1.58}$$

In all derivations we have taken the steady-state to be constant. By adding to this assumption $\lambda_i = \lambda$, constant across countries, it is possible to estimate (1.57) on our cross-section of 75 countries, by taking as dependent variable the difference between initial and final output. MRW do so by taking the difference between output in 1985 and output in 1960. Note that the error term in the cross-sectional model is much larger than the error term in the time-series model, being the cumulation of 25 time-series residuals. We report MRW's results in Table 5. Note that we compare three models: the unconditional convergence model and two conditional convergence models (Solow and augmented Solow). As expected, the results on the estimation of $\lambda$ change as the specification is changed. The richer model gives a point estimate for $\lambda$ of $-0.02$, giving some support to the new-classical model (what is the standard error associated with this point estimate?).

3 (continued) function different from the production function faced by the specific firm; in fact productivity gains at the firm level are not exogenous, as they depend on the general level of industrialization of the society (theory of "learning by doing"). For a very clear discussion of this point see Farmer (1996).


TABLE 5: Testing convergence (dep. var. lyl85-ly60)

Variable    Unconditional      Solow             Augmented Solow
Constant    0.568 (0.432)      2.26 (0.847)      2.48 (0.795)
lyl60       -0.002 (0.054)     -0.23 (0.056)     -0.36 (0.066)
lns         -                  0.65 (0.103)      0.55 (0.101)
lnngd       -                  -0.45 (0.304)     -0.54 (0.286)
lnsch       -                  -                 0.27 (0.079)
R^2         0.00002            0.38              0.47
σ           0.41               0.32              0.30
λ           -0.0078            -0.01             -0.018
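The link between the coefficient on initial output and $\lambda$ can be made explicit. From (1.57), the coefficient on $\ln y_{i,0}$ is $b = -(1 - (1+\lambda)^t)$, so that $\lambda = (1 + b)^{1/t} - 1$. The sketch below applies this inversion, with $t = 25$, to the coefficients on lyl60 reported in Table 5 for the two conditional convergence models; the function name is a hypothetical label introduced for the illustration.

```python
def implied_lambda(b, t=25):
    """Invert b = -(1 - (1 + lam)**t) for lam (hypothetical helper)."""
    return (1.0 + b) ** (1.0 / t) - 1.0


# Coefficients on lyl60 from Table 5
lam_solow = implied_lambda(-0.23)
lam_augmented = implied_lambda(-0.36)

print(round(lam_solow, 4))      # about -0.01
print(round(lam_augmented, 4))  # about -0.018
```

Because $\lambda$ is a nonlinear function of the estimated coefficient, its standard error cannot be read directly from the regression output; an approximation based on the derivative of this function with respect to $b$ would be one way to obtain it.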


1.12 Alternative strategies of research in macroeconometrics 

Some final remarks on the research strategy behind the empirical work considered so far can be useful to set out the general framework for the organization of the material in this book. The starting point of the research strategy of Mankiw, Romer and Weil is a theoretical model, the Solow growth model. The estimated empirical relation is derived from the solution of the model. As the estimation of the relation explicitly derived from the Solow growth model delivers disappointing results, a modification of the model is considered by introducing human capital in the original framework. Such a modification generates satisfactory empirical results and is capable of explaining the empirical failure of the original specification. At this point the authors have their message and are able to convey it to the profession.

Any empirical research strategy is based on the combination of theoretical analysis and work on the data to produce models of the economy. We have shown that time series are the most natural empirical counterpart of the variables in macroeconomic models. In the next chapter we discuss the statistical framework necessary to analyse time series.

We shall then introduce identification, the crucial stage of research in applied macroeconometrics where theory and statistical analysis of the data are brought together. In fact, the different approaches currently adopted in applied empirical work in macroeconomics can be understood as different solutions to the identification problem. On the basis of the working knowledge of fundamentals built in these two chapters, we shall then consider the different approaches to applied macroeconometrics. We start from the Cowles Commission approach, by discussing a model of the monetary transmission mechanism built on the most famous "ad hoc" framework (the IS-LM model augmented by some supply function) imposed on the data to ask the time-honoured question "what does monetary policy do?". Such a model is designed to identify the impact of policy variables on macroeconomic quantities. The objective of the exercise is to determine the value to be assigned to the monetary instruments to achieve a given target for the macroeconomic variables. Exogeneity of the policy variables is assumed on the ground that these are the instruments controlled by the policy maker. We illustrate how the model is used by estimating a small model of the US economy, and we replicate the empirical failure of this generation of macroeconometric models in achieving the objective of their simulation.

Such failure has been rationalized in different ways, leading to different approaches to replace the Cowles Commission research programme.

The LSE approach ([3]) explains the failure of the Cowles Commission methodology by attributing it to the lack of attention to the statistical model underlying the particular econometric structure adopted to analyse the effect of alternative monetary policies. The LSE methodology considers econometric policy evaluation an interesting and feasible exercise. However, the way in which the Cowles Commission approach deals with a legitimate question is not seen as correct. The lack of sufficient interest in the statistical model is interpreted as the root of the failure of the Cowles Commission approach to provide an acceptable answer to an interesting question. The proposed remedy is careful diagnostic checking of the specification adopted. By applying the LSE approach to the same problem faced by the Cowles Commission model, we shall show its merits and limits. On the positive side, we evaluate the improvements in the econometric specification, while, on the negative side, we show why this methodology has not been universally adopted as the unique substitute for the Cowles Commission approach.

Differently from the LSE explanation of the failure of traditional structural modelling, the two most famous and devastating critiques, due to Lucas ([?]) and Sims ([8]), concentrate on the weak theoretical basis of the Cowles Commission models. The Lucas critique explains the failure of structural models when the coefficients describing the impact of monetary policy on the macroeconomic variables of interest depend on the monetary policy regime; in this case no model estimated under a specific regime can be used to simulate the effects of a different monetary policy regime. Such a situation is naturally generated when agents' behaviour is determined by intertemporal optimization. The Sims critique attacks identification from a different perspective, pointing out that the restrictions needed to support exogeneity in structural Cowles Commission-type models are "incredible" in an environment where agents optimise intertemporally.

The natural outcome of these two critiques is that policy simulation should 
not be undertaken on the basis of structural econometric models but rather on 
the basis of simulation of model economies based on microeconomic foundations. 

However, econometrics still plays an important role in the selection of the
appropriate model economy and in the estimation of the deep parameters
describing taste and technology, which are independent of expectations.

The research programme initiated by Sims led to the estimation of VAR
models in empirical macroeconomics. VAR models of the transmission mechanism
are not estimated to yield advice on the best monetary policy; they are
rather estimated to provide empirical evidence on the response of macroeconomic
variables to monetary policy impulses in order to discriminate between
alternative theoretical models of the economy. Monetary policy actions should
be identified using theory-free restrictions, taking into account the potential
endogeneity of policy instruments.

The Generalised Method of Moments is the econometric methodology naturally
applied to the first order conditions for the solution of intertemporal
optimisation problems to derive estimates of the deep parameters in the economy.

Once deep parameters of interest are estimated, the micro-founded model can 
be calibrated and the effect of relevant economic policies can then be assessed. 

We shall devote three chapters to VAR models, GMM estimation and calibration
to illustrate the strategy of empirical research in macroeconomics consistent
with the view that policy advice should be based on the simulation of theoretical
models considering explicitly the intertemporal optimisation problem of agents.

THE PROBABILISTIC STRUCTURE OF TIME SERIES DATA


2.1 Introduction: what is a time-series?

In the previous chapter we introduced time-series to show that one of the
fundamental properties necessary for valid estimation and inference in
the linear model is generally violated by time-series. In this chapter we
discuss this issue in greater depth: we define time-series precisely, together
with the fundamental concepts needed to analyze them; we illustrate how the
problem can be resolved in the context of stationary time-series; finally, we
extend our discussion to non-stationarity and cointegration.

We write a time-series as

$$\{x_1, x_2, \ldots, x_T\} \quad \text{or} \quad \{x_t\}, \; t = 1, \ldots, T$$

where t is an index denoting the period in time in which x occurs. We shall
treat $x_t$ as a random variable; hence a time-series is a sequence of random
variables ordered in time. Such a sequence is known as a stochastic process. The
probability structure of a sequence of random variables is determined by the
joint distribution of the stochastic process.

A possible probability model for such a joint distribution is:

$$x_t = \epsilon_t, \qquad \epsilon_t \sim n.i.d.(0, \sigma_\epsilon^2)$$

i.e. $x_t$ is normally and independently distributed over time with constant
variance and zero mean. In other words, $x_t$ is a white-noise process. A white-noise
process is not a proper model for most macroeconomic time-series because it
does not feature their most common characteristic, namely persistence. To show
the point consider the data-set USUK.XLS which contains, in EXCEL format,
quarterly time series data for nominal and real personal disposable income and
consumption in the UK and the US over the sample 1959:1-1998:1. The data-set,
retrieved from DATASTREAM, contains nine variables:





Table 1: Dataset USUK.XLS

Variable     Description
ukpdispid    personal disposable income in the UK at constant 1992 prices
uspdispid    personal disposable income in the US at constant 1992 prices
uscndurb     consumption of durable goods in the US at current prices
uscndurd     consumption of durable goods in the US at constant 1992 prices
uscnnondb    consumption of non-durable goods in the US at current prices
uscnnondd    consumption of non-durable goods in the US at constant 1992 prices
uscnservb    consumption of services in the US at current prices
uscnservd    consumption of services in the US at constant 1992 prices


All series are adjusted for seasonality. To assess the behaviour of a typical
economic time series against the benchmark of the white-noise process, we have
imported all series into an E-Views workfile and run the following routine:

smpl 1959:1 1998:1 
genr lyus=log(uspdispid) 
genr WN= 8.03+0.36*nrnd 
plot WN lyus 

The routine generates the log of US real disposable income and an artificial
series defined as a constant (8.03) plus a normal random variable with zero mean
and standard deviation of 0.36, where 8.03 and 0.36 are respectively the sample
mean and the sample standard deviation of lyus. Having generated the series, the
program plots them to obtain the following result:

Fig. 2.1. A white-noise process and the log of US real disposable income

Figure 2.1 clearly shows that the white-noise model does not capture the
interesting property of persistence that motivates the study of time series. To
construct more realistic models we consider combinations of $\epsilon_t$. We shall
concentrate on a class of models created by taking linear combinations of white
noise, the ARMA models:

$$\begin{aligned}
AR(1): \quad & x_t = \rho x_{t-1} + \epsilon_t \\
MA(1): \quad & x_t = \epsilon_t + \theta \epsilon_{t-1} \\
AR(p): \quad & x_t = \rho_1 x_{t-1} + \rho_2 x_{t-2} + \ldots + \rho_p x_{t-p} + \epsilon_t \\
MA(q): \quad & x_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \ldots + \theta_q \epsilon_{t-q} \\
ARMA(p,q): \quad & x_t = \rho_1 x_{t-1} + \ldots + \rho_p x_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \ldots + \theta_q \epsilon_{t-q}
\end{aligned}$$

In case it is not already clear, we shall show why ARMA models are obtained
by taking linear combinations of white noise in the next section, where we discuss
the fundamentals strictly necessary to analyze time series.

Note that each of the above models can easily be put into action to generate
the corresponding time-series by appropriately modifying and running the following
programme in E-Views, which generates an AR(1) series:


smpl 1 1
genr X=0
smpl 2 200
series x=0.5*x(-1) +NRND

The programme above generates a sample of 200 observations from an AR(1)
model with $\rho = 0.5$. The series is first initialized for the first observation; the
command series then generates the series for the specified process, in which each
observation is 0.5 times the previous observation plus a random disturbance drawn
from a serially independent standard normal distribution.
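For readers without access to E-Views, the same experiment can be sketched in Python. This is a minimal illustration and not the routine used for the figures in this chapter: the function names simulate_ar1 and lag1_autocorr, the seed and the use of the standard-library random module are our own assumptions.

```python
import random


def simulate_ar1(rho, n, x0=0.0, seed=0):
    """Simulate x_t = rho * x_{t-1} + e_t with e_t ~ N(0, 1), starting from x0."""
    rng = random.Random(seed)
    x = [x0]  # the first observation is initialized, as in the E-Views programme
    for _ in range(n - 1):
        x.append(rho * x[-1] + rng.gauss(0.0, 1.0))
    return x


def lag1_autocorr(x):
    """Sample first-order autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    den = sum((v - mean) ** 2 for v in x)
    return num / den


x = simulate_ar1(rho=0.5, n=200)
print(len(x), round(lag1_autocorr(x), 2))
```

With only 200 observations the sample autocorrelation will typically differ somewhat from 0.5; it should settle closer to that value as n is increased.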

The time series behaviour of the generated X is plotted in Figure 2.2. 

Fig. 2.2. A stationary ARMA(1,1) process

The following modified version of the programme will generate an ARMA(1,1)
series:

smpl 1 1
genr X=0
smpl 1 200
genr u=NRND
smpl 2 200
series x=0.5*x(-1) +u +0.4*u(-1)


2.2 Analyzing time-series: the fundamentals. 

To illustrate empirically all the fundamentals we consider an interesting member
of the ARMA family, the AR(1) model with drift:

$$x_t = \rho_0 + \rho_1 x_{t-1} + \epsilon_t, \qquad \epsilon_t \sim n.i.d.(0, \sigma_\epsilon^2) \qquad (2.1)$$

Given that each realization of our stochastic process is a random variable, the
first relevant fundamental is the density of each observation. In particular, we
distinguish between conditional and unconditional densities. Having introduced
these two concepts we shall define and discuss stationarity; we then generalize
from our specific member to the whole family of ARMA models, and end this
section with a discussion of deterministic and stochastic trends and de-trending
methods. Note that at this introductory stage we concentrate almost exclusively
on univariate models, purely for the sake of exposition. After completing
our introductory tour, we shall concentrate on multivariate models,
which are the focus of this book.

2.2.1 Conditional and unconditional densities 

We distinguish between the conditional and the unconditional density of a
time-series. The unconditional density is obtained under the hypothesis that no
observation on the time series is available, while conditional densities are based
on the observation of some realizations of the random variables. In the case of
time series we derive the unconditional density by ideally placing ourselves at the
moment in time preceding the observation of any realization of the time series. At
that moment the information set consists only of the knowledge of the process
generating the observations. As observations become available, conditional
densities can be computed. As distributions are summarized by their moments, let us
illustrate the difference between conditional and unconditional densities by looking
at our AR(1) model.

The moments of the density of $x_t$ conditional upon $x_{t-1}$ are immediately
obtained from (2.1) as follows:

$$E(x_t \mid x_{t-1}) = \rho_0 + \rho_1 x_{t-1}$$
$$Var(x_t \mid x_{t-1}) = \sigma_\epsilon^2$$
$$Cov[(x_t \mid x_{t-1}), (x_{t-j} \mid x_{t-j-1})] = 0 \quad \text{for each } j$$

To derive the moments of the density of $x_t$ conditional upon $x_{t-2}$, we need
to substitute for $x_{t-1}$ in terms of $x_{t-2}$ from (2.1) to obtain:

$$E(x_t \mid x_{t-2}) = \rho_0 + \rho_0 \rho_1 + \rho_1^2 x_{t-2}$$
$$Var(x_t \mid x_{t-2}) = \sigma_\epsilon^2 (1 + \rho_1^2)$$
$$Cov[(x_t \mid x_{t-2}), (x_{t-j} \mid x_{t-j-2})] = \rho_1 \sigma_\epsilon^2 \quad \text{for } j = 1$$
$$Cov[(x_t \mid x_{t-2}), (x_{t-j} \mid x_{t-j-2})] = 0 \quad \text{for } j > 1$$
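The substitution step behind the two-period-ahead moments can be written out explicitly. The derivation below is a sketch that uses only (2.1); the shorthand $\eta_t$ for the two-step forecast error is our own notation and does not appear elsewhere in the text.

```latex
\begin{align*}
x_t &= \rho_0 + \rho_1 x_{t-1} + \epsilon_t \\
    &= \rho_0 + \rho_1 (\rho_0 + \rho_1 x_{t-2} + \epsilon_{t-1}) + \epsilon_t \\
    &= \rho_0 + \rho_0 \rho_1 + \rho_1^2 x_{t-2} + \eta_t,
       \qquad \eta_t \equiv \epsilon_t + \rho_1 \epsilon_{t-1} \\
Var(\eta_t) &= \sigma_\epsilon^2 (1 + \rho_1^2) \\
Cov(\eta_t, \eta_{t-1})
    &= Cov(\epsilon_t + \rho_1 \epsilon_{t-1},\; \epsilon_{t-1} + \rho_1 \epsilon_{t-2})
     = \rho_1 \sigma_\epsilon^2
\end{align*}
```

Consecutive two-step forecast errors share the single shock $\epsilon_{t-1}$, which is why the covariance is non-zero only for j = 1.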


Finally, unconditional moments are derived by substituting recursively from
(2.1) to express $x_t$ as a function of the information available at time $t_0$, the
moment before we start observing realizations of our process.

$$E(x_t) = \rho_0 (1 + \rho_1 + \rho_1^2 + \ldots + \rho_1^{t-1}) + \rho_1^t x_0$$
$$Var(x_t) = \sigma_\epsilon^2 (1 + \rho_1^2 + \rho_1^4 + \ldots + \rho_1^{2t-2})$$
$$\gamma(j) = Cov(x_t, x_{t-j}) = \rho_1^j Var(x_{t-j})$$
$$\rho(j) = \frac{Cov(x_t, x_{t-j})}{\sqrt{Var(x_t) Var(x_{t-j})}} = \frac{\rho_1^j Var(x_{t-j})}{\sqrt{Var(x_t) Var(x_{t-j})}}$$

Note that $\gamma(j)$ and $\rho(j)$ are functions of j, known respectively as the
autocovariance function and the autocorrelation function.
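The recursive substitution can be checked numerically. The sketch below iterates the AR(1) model (2.1) forward from a fixed starting value and compares the resulting mean and variance with the geometric sums; the function names and the parameter values used in the example are illustrative assumptions only.

```python
def ar1_moments(rho0, rho1, sigma2, x0, t):
    """Mean and variance of x_t given x_0, iterating x_s = rho0 + rho1*x_{s-1} + e_s."""
    mean, var = x0, 0.0
    for _ in range(t):
        mean = rho0 + rho1 * mean          # E(x_s) = rho0 + rho1 * E(x_{s-1})
        var = rho1 ** 2 * var + sigma2     # Var(x_s) = rho1^2 * Var(x_{s-1}) + sigma2
    return mean, var


def ar1_moments_closed_form(rho0, rho1, sigma2, x0, t):
    """The same moments computed from the geometric sums."""
    mean = rho0 * sum(rho1 ** k for k in range(t)) + rho1 ** t * x0
    var = sigma2 * sum(rho1 ** (2 * k) for k in range(t))
    return mean, var


print(ar1_moments(1.0, 0.5, 2.0, 3.0, 7))
print(ar1_moments_closed_form(1.0, 0.5, 2.0, 3.0, 7))
```

Calling the first function with a large t and an autoregressive coefficient below one in absolute value shows the moments settling down, while a coefficient equal to one makes both grow with t.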

2.2.2 Stationarity

A stochastic process is said to be strictly stationary if its joint density function
does not depend on time. More formally, a stochastic process is stationary if, for
each $j_1, j_2, \ldots, j_n$, the joint distribution

$$f(x_t, x_{t+j_1}, x_{t+j_2}, \ldots, x_{t+j_n})$$

does not depend on t.

A stochastic process is said to be covariance stationary if its first two
unconditional moments do not depend on time, i.e. if the following relations are
satisfied for each h, i, j:

$$E(x_t) = E(x_{t+h}) = \mu$$
$$E(x_t^2) = E(x_{t+h}^2) = \mu_2$$
$$E(x_{t+i} x_{t+j}) = \mu_{ij}$$

where $\mu$, $\mu_2$ and $\mu_{ij}$ do not depend on t.

In the case of our AR(1) process the condition for stationarity is $|\rho_1| < 1$.
In fact, when this condition is satisfied we have:

$$E(x_t) = E(x_{t+h}) = \frac{\rho_0}{1 - \rho_1}$$
$$Var(x_t) = Var(x_{t+h}) = \frac{\sigma_\epsilon^2}{1 - \rho_1^2}$$
$$Cov(x_t, x_{t-j}) = \rho_1^j Var(x_t)$$

On the other hand, it is easily shown that, when $\rho_1 = 1$, the process is
non-stationary. In fact we have:

$$E(x_t) = \rho_0 t + x_0$$
$$Var(x_t) = \sigma_\epsilon^2 t$$
$$Cov(x_t, x_{t-j}) = \sigma_\epsilon^2 (t - j)$$

To illustrate graphically the properties of different AR processes we generate,
using the E-Views programme described above, three AR processes
with $\rho_1$ set to 0.6 (series X1), 0.8 (series X2), and 1 (series X3) respectively. To
allow direct comparison we do not include a drift in any of the processes, so for all
of them we have $\rho_0 = 0$. The time-series behaviour of the three processes is
reported in Figure 2.3.



Fig. 2.3. First order autoregressive processes with $\rho_1 = 0.6$ (X1), $\rho_1 = 0.8$
(X2), $\rho_1 = 1$ (X3)

Note that X1 and X2 tend to revert towards their unconditional mean rather
quickly. The unconditional mean of X3 is also zero, but X3 does not show any
tendency to revert towards its mean; in fact, as the sample size grows, the
variance of X3 increases without bound.

2.2.3 ARMA processes 

Before introducing the fundamentals of time-series we asserted that white-noise
processes are too simplistic to describe economic time series and that a
closer fit can be obtained by considering combinations of white noise. We
then introduced ARMA models and discussed the fundamentals needed to understand
their properties, but we have not yet shown that ARMA models can be considered
as combinations of white-noise processes. The point is shown by considering a
time-series as a polynomial distributed lag of a white-noise process:

$$\begin{aligned}
x_t &= u_t + b_1 u_{t-1} + b_2 u_{t-2} + \ldots + b_n u_{t-n} \\
    &= (1 + b_1 L + b_2 L^2 + \ldots + b_n L^n) u_t \\
    &= b(L) u_t
\end{aligned}$$

where L is the lag operator. The Wold decomposition theorem, which states
that any stationary stochastic process can be expressed as the sum of a
deterministic component and a stochastic moving-average component, warrants the
generality of our representation. However, in general, a very high order of the
polynomial b(L) is required to describe a time-series successfully. This feature
can be problematic for estimation, given the usual limitations on sample sizes.
The problem is solved if the polynomial b(L) can be represented as the ratio of
two polynomials of lower order. In this case we have:


$$x_t = b(L) u_t = \frac{a(L)}{c(L)} u_t$$
$$c(L) x_t = a(L) u_t \qquad (2.2)$$

(2.2) is an ARMA process. The process is stationary when the roots of c(L) =
0 lie outside the unit circle. The MA component is said to be invertible when the
roots of a(L) = 0 lie outside the unit circle. Invertibility of the MA component
allows it to be represented as an autoregressive process.
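The root conditions can be checked mechanically for low-order lag polynomials. The sketch below is an illustration under our own conventions: a polynomial is passed as its list of coefficients in ascending powers of L, so that c(L) = 1 - 0.7L is written [1, -0.7], and only polynomials of degree one or two are handled. The function names are hypothetical.

```python
import cmath


def lag_poly_roots(coeffs):
    """Roots of p(z) = coeffs[0] + coeffs[1]*z (+ coeffs[2]*z**2)."""
    if len(coeffs) == 2:
        p0, p1 = coeffs
        return [-p0 / p1]
    if len(coeffs) == 3:
        p0, p1, p2 = coeffs
        disc = cmath.sqrt(p1 * p1 - 4 * p2 * p0)
        return [(-p1 + disc) / (2 * p2), (-p1 - disc) / (2 * p2)]
    raise ValueError("only polynomials of degree 1 or 2 are supported")


def all_outside_unit_circle(coeffs):
    """True when every root has modulus strictly greater than one."""
    return all(abs(root) > 1.0 for root in lag_poly_roots(coeffs))


# ARMA(1,1) with c(L) = 1 - 0.7L and a(L) = 1 + 0.4L
print(all_outside_unit_circle([1, -0.7]))   # stationarity of the AR part
print(all_outside_unit_circle([1, 0.4]))    # invertibility of the MA part
# Random walk: c(L) = 1 - L has a root on the unit circle
print(all_outside_unit_circle([1, -1]))
```

Higher-order polynomials would need a general root finder, which is outside the scope of this sketch.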

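To illustrate how the autocovariance and the autocorrelation functions of
an ARMA model are derived, we consider the simplest case, the ARMA(1,1)
process:

$$x_t = c_1 x_{t-1} + \epsilon_t + a_1 \epsilon_{t-1}$$
$$(1 - c_1 L) x_t = (1 + a_1 L) \epsilon_t \qquad (2.3)$$

(2.3) can be re-written as:

$$\begin{aligned}
x_t &= \frac{1 + a_1 L}{1 - c_1 L} \epsilon_t \\
    &= (1 + a_1 L) \left(1 + c_1 L + (c_1 L)^2 + \ldots \right) \epsilon_t \\
    &= \left[1 + (a_1 + c_1) L + c_1 (a_1 + c_1) L^2 + c_1^2 (a_1 + c_1) L^3 + \ldots \right] \epsilon_t
\end{aligned}$$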


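Then we have

$$\begin{aligned}
Var(x_t) &= \left[1 + (a_1 + c_1)^2 + c_1^2 (a_1 + c_1)^2 + \ldots \right] \sigma_\epsilon^2 \\
         &= \left[1 + \frac{(a_1 + c_1)^2}{1 - c_1^2} \right] \sigma_\epsilon^2
\end{aligned}$$

$$\begin{aligned}
Cov(x_t, x_{t-1}) &= \left[(a_1 + c_1) + c_1 (a_1 + c_1)^2 + c_1^3 (a_1 + c_1)^2 + \ldots \right] \sigma_\epsilon^2 \\
                  &= \left[(a_1 + c_1) + \frac{c_1 (a_1 + c_1)^2}{1 - c_1^2} \right] \sigma_\epsilon^2
\end{aligned}$$

Hence

$$\rho(1) = \frac{Cov(x_t, x_{t-1})}{Var(x_t)} = \frac{(1 + a_1 c_1)(a_1 + c_1)}{1 + a_1^2 + 2 a_1 c_1}$$

Successive values for $\rho(j)$ are obtained from the recurrence relation
$\rho(j) = c_1 \rho(j-1)$ for $j \geq 2$.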
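The derivation can be verified numerically by building the moving-average weights of the ARMA(1,1) process and summing their cross-products. The following is a sketch under stated assumptions: the infinite sum is truncated at n_terms weights (500 by default, our choice), and the function names are illustrative.

```python
def arma11_acf(c1, a1, max_lag, n_terms=500):
    """Autocorrelations of x_t = c1*x_{t-1} + e_t + a1*e_{t-1} from truncated MA weights."""
    # psi_0 = 1 and psi_k = c1**(k-1) * (a1 + c1) for k >= 1
    psi = [1.0] + [c1 ** (k - 1) * (a1 + c1) for k in range(1, n_terms)]

    def autocov(j):
        # autocovariance at lag j, in units of the innovation variance
        return sum(psi[k] * psi[k + j] for k in range(n_terms - j))

    gamma0 = autocov(0)
    return [autocov(j) / gamma0 for j in range(1, max_lag + 1)]


def arma11_rho1(c1, a1):
    """Closed-form first-order autocorrelation of a stationary ARMA(1,1)."""
    return (1 + a1 * c1) * (a1 + c1) / (1 + a1 ** 2 + 2 * a1 * c1)


acf = arma11_acf(0.7, 0.4, max_lag=3)
print([round(r, 4) for r in acf], round(arma11_rho1(0.7, 0.4), 4))
```

The first element of the list should agree with the closed form, and each later element should equal the previous one multiplied by the autoregressive coefficient.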

To illustrate the difference between an AR and an ARMA process, we have
generated an AR(1) process with $c_1 = 0.7$ and an ARMA(1,1) process with
$c_1 = 0.7$ and $a_1 = 0.4$ in E-Views. The two autocorrelation functions (for lags
up to 10) are reported in Table 2.

TABLE 2: Autocorrelation functions

Lag   AR (0.7)   ARMA (0.7,0.4)
1     0.712      0.836
2     0.561      0.639
3     0.437      0.491
4     0.304      0.364
5     0.254      0.305
6     0.270      0.305
7     0.270      0.313
8     0.298      0.326
9     0.279      0.323
10    0.296      0.316


Note that the autocorrelation of the ARMA(1,1) process is higher than the
autocorrelation of the AR(1) process; this is because $a_1 > 0$.

2.2.4 Deterministic and Stochastic Trends

Figure 2.1 at the beginning of this chapter shows that macroeconomic time series,
besides being persistent, generally feature upward trends. Non-stationarity
of time-series is a possible manifestation of a trend. Consider for example the
random walk with drift:

$$x_t = a_0 + x_{t-1} + \epsilon_t, \qquad \epsilon_t \sim n.i.d.(0, \sigma_\epsilon^2)$$

In this case recursive substitution yields:

$$x_t = x_0 + a_0 t + \sum_{i=0}^{t-1} \epsilon_{t-i} \qquad (2.4)$$

which shows that the non-stationary series contains both a deterministic ($a_0 t$)
and a stochastic ($\sum_{i=0}^{t-1} \epsilon_{t-i}$) trend.

One of the easiest ways to make a non-stationary series stationary is by
differencing it:

$$\Delta x_t = x_t - x_{t-1} = (1 - L) x_t = a_0 + \epsilon_t$$

In general, if a time series needs to be differenced k times to be stationary,
then that series is said to be integrated of order k, or I(k). Our random walk is
I(1). When the d-th difference of a time-series x, $\Delta^d x_t$, can be represented by
an ARMA(p, q) model we say that $x_t$ is an autoregressive integrated moving-average
process of order p, d, q and we denote it as ARIMA(p, d, q).
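Differencing is simple to reproduce outside E-Views. The sketch below builds a random walk with drift from a stored sequence of shocks and confirms that first-differencing recovers the drift plus the shock in each period; the function names, the seed and the drift of 0.1 are illustrative assumptions.

```python
import random


def random_walk_with_drift(a0, n, x0=0.0, seed=0):
    """Return the path x_0..x_n of x_t = a0 + x_{t-1} + e_t and the shocks e_1..e_n."""
    rng = random.Random(seed)
    shocks = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x = [x0]
    for e in shocks:
        x.append(a0 + x[-1] + e)
    return x, shocks


def difference(x, k=1):
    """Apply the first-difference operator (1 - L) k times."""
    for _ in range(k):
        x = [x[t] - x[t - 1] for t in range(1, len(x))]
    return x


x, shocks = random_walk_with_drift(a0=0.1, n=200)
dx = difference(x)
# Largest gap between the differenced series and drift + shock (rounding error only)
print(max(abs(d - (0.1 + e)) for d, e in zip(dx, shocks)))
```

Each application of the operator shortens the series by one observation, which is why an I(k) series loses k observations when it is made stationary.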

It is interesting to compare the behaviour of integrated processes with that of
trend stationary processes. Trend stationary processes feature only a deterministic
trend:

$$z_t = \alpha + \beta t + \epsilon_t \qquad (2.5)$$

The $z_t$ process is non-stationary, but the non-stationarity is removed just by
regressing $z_t$ on a deterministic trend. This is not the case for integrated
processes like (2.4), where the removal of the deterministic trend does not deliver
a stationary time-series. Deterministic trends have no memory, while integrated
variables have infinite memory. Both integrated variables and deterministic trends
exhibit systematic variations, but in one case the variation is predictable while in
the other it is not. This point is easily seen in Figure 2.4, where we report three
series for a sample of 200 observations. The series are generated in E-Views by
running the following programme:

smpl 1 1
genr ST1=0
genr ST2=0
smpl 2 200
series ST1=0.1+ST1(-1)+nrnd
series ST2=0.1+ST2(-1)+nrnd
series DT=0.1*@trend+nrnd


We have a deterministic trend (DT) generated by simulating equation (2.5)
with $\alpha = 0$, $\beta = 0.1$, and a white-noise independently distributed as a standard
normal (nrnd), and two integrated series (ST1 and ST2), which are random
walks with a drift of 0.1. The only difference between ST1 and ST2 is in the
realizations of the error terms, which are different drawings from the same
serially independent standard normal distribution.
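The different effect of de-trending on the two kinds of series can be illustrated with a short simulation. This is a sketch under our own assumptions, not the programme behind Figure 2.4: we use 2000 observations rather than 200 to make the contrast sharper, fit the trend by OLS, and summarize persistence by the first-order autocorrelation of the de-trended residuals. All function and variable names are illustrative.

```python
import random


def ols_detrend(y):
    """Residuals from an OLS regression of y on a constant and a linear trend."""
    n = len(y)
    t_mean = (n - 1) / 2
    y_mean = sum(y) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y[t] - y_mean) for t in range(n))
    slope = sxy / sxx
    return [y[t] - y_mean - slope * (t - t_mean) for t in range(n)]


def lag1_autocorr(x):
    """Sample first-order autocorrelation of a series."""
    n = len(x)
    mean = sum(x) / n
    num = sum((x[t] - mean) * (x[t - 1] - mean) for t in range(1, n))
    return num / sum((v - mean) ** 2 for v in x)


rng = random.Random(0)
n = 2000
dt = [0.1 * t + rng.gauss(0.0, 1.0) for t in range(n)]   # deterministic trend
st = [0.0]                                               # random walk with drift
for _ in range(n - 1):
    st.append(0.1 + st[-1] + rng.gauss(0.0, 1.0))

rho_dt = lag1_autocorr(ols_detrend(dt))
rho_st = lag1_autocorr(ols_detrend(st))
print(round(rho_dt, 2), round(rho_st, 2))
```

The de-trended DT residuals should show almost no autocorrelation, while the de-trended random walk should remain highly persistent, in line with the statement that removing the deterministic trend does not make an integrated series stationary.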



Fig. 2.4. Deterministic (DT) and stochastic (ST1 and ST2) trends

2.3 Persistence. A Monte-Carlo experiment

Persistence of time-series destroys one of the crucial properties needed for
valid estimation and inference in the linear model. We have already seen that in
the context of the linear model

$$y = X\beta + \epsilon$$

the following property is required for valid estimation and inference:

$$E(\epsilon \mid X) = 0 \qquad (2.6)$$

Hypothesis (2.6) implies that

$$E(\epsilon_i \mid x_1, \ldots, x_i, \ldots, x_n) = 0 \qquad (i = 1, \ldots, n)$$

Think of the simplest time-series model for a generic variable y:

$$y_t = a_0 + a_1 y_{t-1} + \epsilon_t$$

It is clear that if $a_1 \neq 0$, then, although it is true that $E(\epsilon_t \mid y_{t-1}) = 0$,
$E(\epsilon_{t-1} \mid y_{t-1}) \neq 0$ and (2.6) is destroyed.

The question is how serious the problem is. To assess intuitively the
consequences of persistence we construct a small Monte-Carlo simulation on the
small-sample properties of the OLS estimator of the parameters in an AR(1) process.

A Monte-Carlo simulation is based on the generation of a sample from a
known Data Generating Process (DGP). A set of random numbers from a given
distribution is generated first (a normally independent white-noise disturbance
in our case) for a sample size of interest (in our case 200 observations) and then
the process of interest is constructed (in our case an AR(1) process). When a
sample of observations on the process of interest is available, the relevant
parameters can be estimated and their fitted values can be compared with the
known true values. For this reason the Monte-Carlo simulation is a sort of
controlled experiment. The only potential problem with this procedure is that the
set of random numbers drawn is just one possible outcome and the estimates are
dependent on the sequence of simulated white-noise residuals. To overcome this
problem, in a Monte-Carlo study the DGP is replicated many times. For each
replication a set of estimates is obtained, and then averages across replications of
the estimated parameters are computed to be assessed against the known true
values.

Our Monte-Carlo simulation is performed by running the following programme 
in E-Views: 

genr a1sum=0
for !i=1 to 500
smpl 1 1
genr y{!i}=10
smpl 2 200
series y{!i}=1+0.9*y{!i}(-1)+nrnd
equation eq.ls y{!i}=c(1)+c(2)*y{!i}(-1)
eq.rls(c,s)
genr a1sum=a1sum+R_c2
next
genr a1mean=a1sum/500

The first line of the programme generates a series to accumulate the values of the
estimated $a_1$ across replications. In the next step we set a counter to keep track
of the replications (in this specific case we have 500 of them) and the loop over the
five hundred replications is set. In each replication a sample of two hundred
observations from an AR(1) is generated and then the autoregressive parameter
is estimated. Note that this estimation is performed recursively, starting with
a sample of five observations and then adding one observation at a time
until the last one. The series of these estimates is stored at each replication
with the command eq.rls(c,s). At the end of all replications we have 500
series, each containing 195 estimated parameters (the first being
the parameter estimated on the sample 1-5, the second being the parameter
estimated on the sample 1-6, the last one being the parameter estimated on the
full sample). We report the average across replications in Figure 2.5.

From Figure 2.5 we note that the estimate of $a_1$ is heavily biased in small
samples, but the bias is reduced as the sample gets larger and eventually disappears.
In fact, it can be shown analytically that the average of the OLS estimate of $a_1$
is $a_1 (1 - 2/T)$. This is an interesting result, which can be generalized. For
stationary time-series, the correlation which destroys the orthogonality between
residuals and regressors in the linear regression model tends to disappear as the
distance between observations increases. Therefore, as we shall show in the next
section, the finite sample results can be extended to time-series by considering
large samples. This is achieved by introducing asymptotic theory.
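A simplified version of the experiment can be sketched in Python. Unlike the E-Views programme, this sketch does not estimate recursively: it re-estimates the model on freshly simulated samples of a few chosen sizes and averages the OLS slope across replications. The function names, the seed and the sample sizes are our own assumptions; the DGP (intercept 1, slope 0.9, starting value 10, 500 replications) follows the programme above.

```python
import random


def ols_ar1_slope(y):
    """OLS estimate of a1 in y_t = a0 + a1*y_{t-1} + e_t."""
    lagged, current = y[:-1], y[1:]
    n = len(lagged)
    m_lag, m_cur = sum(lagged) / n, sum(current) / n
    sxx = sum((v - m_lag) ** 2 for v in lagged)
    sxy = sum((v - m_lag) * (w - m_cur) for v, w in zip(lagged, current))
    return sxy / sxx


def mean_estimate(sample_size, a0=1.0, a1=0.9, y0=10.0, reps=500, seed=0):
    """Average OLS estimate of a1 across Monte-Carlo replications."""
    rng = random.Random(seed)
    total = 0.0
    for rep in range(reps):
        y = [y0]
        for t in range(sample_size - 1):
            y.append(a0 + a1 * y[-1] + rng.gauss(0.0, 1.0))
        total += ols_ar1_slope(y)
    return total / reps


for T in (20, 50, 200):
    print(T, round(mean_estimate(T), 3))
```

The averages should lie below the true value of 0.9 and move towards it as the sample size grows, which is the pattern described for Figure 2.5.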

2.4 The traditional solution: asymptotic theory 

Stationary time-series feature time-independent distributions; as a consequence
the effect of any specific innovation disappears as time elapses. We shall show in
this section how the intuition given by the simple Monte-Carlo simulation can be
extended and how asymptotic theory can be used to perform valid estimation and
inference when modelling stationary time-series.

Fig. 2.5. Small sample bias

2.4.1 Basic elements of asymptotic theory

In this section we shall introduce the elements of asymptotic theory necessary
to illustrate how all the results in estimation and inference for the linear model
applied to cross-sectional data in Chapter 1 can be extended to time-series
models.¹

Consider a sequence $\{X_T\}$ of random variables with the associated sequence
of distribution functions $\{F_T\} = F_1, \ldots, F_T$. We give the following definitions of
convergence for $X_T$.

2.4.1.1 Convergence in distribution. Given a random variable X with
distribution function F, $X_T$ converges in distribution to X if the following equality is
satisfied:

$$\lim_{T \to \infty} \Pr\{X_T < x_0\} = \Pr\{X < x_0\}$$

for all $x_0$ where the function F(x) is continuous.


¹ For a formal treatment of all these topics see White ([60]).

2.4.1.2 Convergence in probability. Given a random variable X with
distribution function F, $X_T$ converges in probability to X if, for each $\epsilon > 0$, the
following relation holds:

$$\lim_{T \to \infty} \Pr\{|X_T - X| < \epsilon\} = 1$$

Note that convergence in probability implies convergence in distribution.

2.4.1.3 Central limit theorem (formulation of Lindeberg-Levy). Given a
sequence $\{X_t\}$ of identically and independently distributed random variables with
mean $\mu$ and finite variance $\sigma^2$, defining

$$\bar{X}_T = \frac{1}{T} \sum_{t=1}^{T} X_t, \qquad \omega_T = \frac{\sqrt{T}\,(\bar{X}_T - \mu)}{\sigma}$$

$\omega_T$ converges in distribution to a standard normal.

2.4.1.4 Slutsky’s Theorem. For any random variable $X_T$ such that $plim X_T =
a$, where a is a constant, given a function $g(\cdot)$ continuous at a, we have that
$plim\, g(X_T) = g(a)$.

2.4.1.5 Cramer’s Theorem. Given two random variables $X_T$ and $Y_T$ such that
$Y_T$ converges in distribution to Y and $X_T$ converges in probability to a constant
a, the following three relationships hold:

• $X_T + Y_T$ converges in distribution to $(a + Y)$

• $Y_T / X_T$ converges in distribution to $(Y / a)$, provided $a \neq 0$

• $Y_T \cdot X_T$ converges in distribution to $(Y \cdot a)$

Note that all theorems introduced so far extend to vectors of random
variables.

2.4.1.6 Mann-Wald Theorem. Consider a vector $z_t$ ($k \times 1$) of random variables
which satisfies the following property:

$$plim \; T^{-1} \sum_{t=1}^{T} z_t z_t' = Q$$

where Q is a positive definite matrix. Consider also a sequence $\epsilon_t$ of random
variables, identically and independently distributed with zero mean and finite
variance $\sigma^2$, for which finite moments of each order are defined. If $E(z_t \epsilon_t) =
0$, then we have

$$plim \; T^{-1} \sum_{t=1}^{T} z_t \epsilon_t = 0, \qquad T^{-1/2} \sum_{t=1}^{T} z_t \epsilon_t \xrightarrow{d} N(0, \sigma^2 Q)$$


2.4.2 Application to models for stationary time-series 
Consider the following time-series model: 


$$y_t = \alpha y_{t-1} + \beta x_t + u_t$$

where $x_t$ is a stationary variable and $|\alpha| < 1$. As already shown, $E(y_t u_{t-1}) \neq 0$
and the OLS estimator of $\alpha$ is biased.

Re-write the model as:

$$y_t = z_t' \gamma + u_t, \qquad z_t' = [\, y_{t-1} \;\; x_t \,], \qquad \gamma = \begin{bmatrix} \alpha \\ \beta \end{bmatrix}$$

By applying the Mann-Wald results we can derive the asymptotic distribution
of the OLS estimator of $\gamma$, $\hat{\gamma}$:

$$\sqrt{T}\,(\hat{\gamma} - \gamma) \xrightarrow{d} N[0, \sigma^2 Q^{-1}]$$

and all the finite sample results available for cross-sections can be extended
to stationary time-series just by considering large-sample theory.
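The steps linking the Mann-Wald theorem to the distribution of the OLS estimator can be sketched as follows. The derivation assumes that the disturbance $u_t$ satisfies the conditions the theorem places on $\epsilon_t$ and that $T^{-1} \sum z_t z_t'$ converges in probability to Q; it is an outline rather than a formal proof.

```latex
\begin{align*}
\hat{\gamma} &= \Big( \sum_{t=1}^{T} z_t z_t' \Big)^{-1} \sum_{t=1}^{T} z_t y_t
  = \gamma + \Big( T^{-1} \sum_{t=1}^{T} z_t z_t' \Big)^{-1} T^{-1} \sum_{t=1}^{T} z_t u_t \\
plim \, \hat{\gamma} &= \gamma + Q^{-1} \cdot 0 = \gamma
  \qquad \text{(Slutsky and Mann-Wald)} \\
\sqrt{T} \, (\hat{\gamma} - \gamma)
  &= \Big( T^{-1} \sum_{t=1}^{T} z_t z_t' \Big)^{-1} T^{-1/2} \sum_{t=1}^{T} z_t u_t
  \xrightarrow{d} Q^{-1} N(0, \sigma^2 Q) = N(0, \sigma^2 Q^{-1})
  \qquad \text{(Cramer)}
\end{align*}
```

The last equality uses the fact that pre-multiplying a normal vector with covariance $\sigma^2 Q$ by $Q^{-1}$ gives covariance $\sigma^2 Q^{-1} Q Q^{-1} = \sigma^2 Q^{-1}$.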

2.5 Stochastic-trends and spurious regressions 

From what we have discussed so far it should be clear that most econometric
analysis is based on the variances and covariances among variables. In the case
of independent sampling (cross-section) we can use finite sample moments for
estimation and inference; in the case of stationary time-series the consideration
of moments in large samples can solve the problems peculiar to time-series in
small samples. Within this framework it should be immediately clear that
non-stationarity causes problems. In fact, we know that unconditional moments are
not defined for non-stationary time-series. Consider, for the sake of illustration, an
OLS regression of an I(0) variable $y_t$ on an I(1) variable $x_t$. The OLS estimator
of the coefficient in the regression of $y_t$ on $x_t$ converges to zero as the sample size
increases; in fact the variance of $x_t$, being divergent, dominates the covariance
between the two variables. In general, standard asymptotic theory is not applicable
to non-stationary time-series (see, for example, Hatanaka ([23]) and Maddala-Kim
([39])). So, unless all the trends observed in time-series are deterministic, the
solution of reverting to asymptotic theory is not directly accessible.

To give an intuition of the importance of non-stationarity in time-series and
to illustrate the problems related to non-stationarity, consider the results of a
"crazy" regression, obtained by relating the log of consumption in the US to the
log of personal disposable income in the UK:


TABLE 3: Regressing US consumption on UK disposable income
Sample: 1959:1 1998:1, Dependent Variable: LCUS

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -5.612676     0.160374     -34.99740     0.0000
LYUK        1.208592     0.014419      83.81657     0.0000

R-squared 0.978413, S.E. of regression 0.052291, DW-stat 0.140469


Note that the regression features a very high $R^2$ and that UK disposable
income is very significant in explaining US consumption. We have a case of a
spurious regression, which shows the relevance of non-stationarity in economic
time-series. To elaborate on this point consider the following two simple
univariate time-series models for LCUS and LYUK.


TABLE 4: Univariate time-series models for US consumption and UK disposable income

Variable    Coefficient   Std. Error   t-Statistic   Prob.

Dependent variable: LCUS
C           0.039         0.008        4.91          0.0000
LCUS(-1)    0.996         0.001        964.9         0.0000
R-squared 0.999835, S.E. of regr. 0.004537, DW stat 1.397403

Dependent variable: LYUK
C           0.050         0.049        1.00          0.3185
LYUK(-1)    0.996         0.004        222           0.0000
R-squared 0.999835, S.E. of regr. 0.004537, DW stat 1.397403


Despite the simplicity of the two time-series models for LCUS and LYUK, we
note that both series can be approximated by random walk models:

$$LCUS_t = a_0 + LCUS_{t-1} + \epsilon_{1t}$$
$$LYUK_t = b_0 + LYUK_{t-1} + \epsilon_{2t}$$
$$\epsilon_{1t} \sim n.i.d.(0, \sigma_{\epsilon_1}^2), \qquad \epsilon_{2t} \sim n.i.d.(0, \sigma_{\epsilon_2}^2)$$

As we already know, recursive substitution yields:

$$LCUS_t = LCUS_0 + a_0 t + \sum_{i=0}^{t-1} \epsilon_{1,t-i}$$
$$LYUK_t = LYUK_0 + b_0 t + \sum_{i=0}^{t-1} \epsilon_{2,t-i}$$

When the following model is estimated

$$LCUS_t = \alpha + \beta LYUK_t + u_t,$$

the coefficient $\beta$ is significant as both series have a deterministic trend. However,
for the relation to be non-spurious the regression would also need to remove
the stochastic trend from the dependent variable, leaving stationary
residuals. If this does not happen, then the correlation we observe can be labelled
as spurious. We report in Figure 2.6 the residuals from the OLS regression of
LCUS on LYUK.

Fig. 2.6. A spurious regression

Visual inspection confirms the intuition that the regression has delivered
a spurious relation, having not removed the stochastic trend from the
non-stationary dependent variable. The reported DW statistic of 0.14 gives a more
formal background to the visual impression. In fact the Durbin-Watson statistic,
originally designed to test for the presence of first-order autocorrelation in the
residuals, can be re-calibrated to test for stationarity. We have

$$DW = \frac{\sum_{t=2}^{T} (\hat{u}_t - \hat{u}_{t-1})^2}{\sum_{t=1}^{T} \hat{u}_t^2} \simeq 2 (1 - \hat{\rho})$$

where $\hat{\rho}$ is the OLS coefficient from the regression of $\hat{u}_t$ on $\hat{u}_{t-1}$. The test
was originally tabulated to test the hypothesis $H_0: \rho = 0$, but critical values
for the null of non-stationarity, $H_0: \rho = 1$, have been provided by
Sargan-Bhargava ([51]). According to these critical values the null of non-stationarity
cannot be rejected by an observed value of 0.14 for the DW statistic.


In conclusion we note that non-stationarity of time-series is problematic in
that it might generate spurious regressions and it does not allow the use of
standard large-sample theory for valid estimation and inference in the linear model.
Before considering the solutions to the problem, in the next section we clarify it
further by illustrating it from a different perspective.
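The spurious regression phenomenon is easy to reproduce with artificial data. The sketch below regresses one driftless random walk on another, independent one, and records how often the slope appears significant and how low the Durbin-Watson statistic is. It is an illustration under our own assumptions (150 observations, 200 replications, a 1.96 critical value, hypothetical function names) and does not use the USUK data.

```python
import random


def random_walk(n, rng):
    """A driftless random walk of length n with standard normal increments."""
    x = [0.0]
    for _ in range(n - 1):
        x.append(x[-1] + rng.gauss(0.0, 1.0))
    return x


def ols_with_diagnostics(y, x):
    """Regress y on a constant and x; return the slope t-statistic and the DW statistic."""
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((v - mx) ** 2 for v in x)
    beta = sum((v - mx) * (w - my) for v, w in zip(x, y)) / sxx
    alpha = my - beta * mx
    resid = [w - alpha - beta * v for v, w in zip(x, y)]
    rss = sum(u * u for u in resid)
    se_beta = (rss / (n - 2) / sxx) ** 0.5
    dw = sum((resid[t] - resid[t - 1]) ** 2 for t in range(1, n)) / rss
    return beta / se_beta, dw


def spurious_experiment(n=150, reps=200, seed=0):
    """Share of replications with |t| > 1.96 and the average DW statistic."""
    rng = random.Random(seed)
    rejections, dw_total = 0, 0.0
    for _ in range(reps):
        t_stat, dw = ols_with_diagnostics(random_walk(n, rng), random_walk(n, rng))
        rejections += abs(t_stat) > 1.96
        dw_total += dw
    return rejections / reps, dw_total / reps


print(spurious_experiment())
```

Although the two series are unrelated by construction, the slope should look significant in most replications, while the low Durbin-Watson values signal that the residuals are not stationary.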

2.5.1 Non-stationarity and the likelihood function. 

Consider a vector $x_t$ containing observations on time series variables at time t.
A sample of T time series observations on all the variables can be represented
as follows:

$$X_T^1 = \begin{bmatrix} x_1 \\ \vdots \\ x_T \end{bmatrix}$$

In general, estimation is performed by considering the joint sample density
function, also known as the likelihood function, which we can express as
$D(X_T^1 \mid X_0, \theta)$. The likelihood function is defined on the parameter space
$\Theta$, given the observed sample and a set of initial conditions $X_0$. Such initial
conditions can be interpreted as the pre-sample observations on the relevant
variables (which are usually not available). In the case of independent observations
the likelihood function can be written as the product of the density functions for
each observation. However, this is not the relevant case for time-series, as time-series
observations are in general sequentially correlated. In the case of time-series the
sample density is then constructed using the concept of sequential conditioning.
The likelihood function, conditioned with respect to initial conditions, can
always be written as the product of a marginal density and a conditional density
as follows:


$$D(X_T^1 \mid X_0, \theta) = D(x_1 \mid X_0, \theta) \, D(X_T^2 \mid X_1, \theta).$$

Obviously we also have

$$D(X_T^2 \mid X_1, \theta) = D(x_2 \mid X_1, \theta) \, D(X_T^3 \mid X_2, \theta)$$

and, by recursive substitution, we eventually obtain:

$$D(X_T^1 \mid X_0, \theta) = \prod_{t=1}^{T} D(x_t \mid X_{t-1}, \theta).$$

Having obtained $D(X_T^1 \mid X_0, \theta)$ we can in theory derive
$D(X_T^1, \theta)$ by integrating the density conditional on pre-sample
observations with respect to $X_0$. In practice this might not be analytically
tractable, as $D(X_0)$ is not known. The hypothesis of stationarity becomes
crucial at this stage, as stationarity restricts the memory of time series and
limits the effects of pre-sample observations to the first observations in the
sample. This is the reason why, in the case of stationary processes, initial
conditions can simply be ignored. Clearly the larger the sample the better, as
the weight of the information lost becomes smaller. Moreover, note also that,
even by omitting initial conditions, we have:

$$D(X_T^1 \mid X_0, \theta) = D(x_1 \mid X_0, \theta) \prod_{t=2}^{T} D(x_t \mid X_{t-1}, \theta).$$

Therefore the likelihood function is separated into the product of $T-1$
conditional distributions and one unconditional distribution. In the case of
non-stationarity the unconditional distribution is not defined. On the other
hand, in the case of stationarity the DGP is completely described by the
conditional density function $D(x_t \mid X_{t-1}, \theta)$.
2.5.1.1 An illustration: the first order autoregressive process

To give more empirical content to our case, let us consider again the case of
the univariate first order autoregressive process:

$$x_t \mid X_{t-1} \sim N(\lambda x_{t-1}, \sigma^2) \qquad (2.7)$$

$$D(X_T^1 \mid \lambda, \sigma^2) = D(x_1 \mid \lambda, \sigma^2) \prod_{t=2}^{T} D(x_t \mid X_{t-1}, \lambda, \sigma^2). \qquad (2.8)$$

From (2.8) it is clear that the likelihood function involves $T-1$ conditional
densities and one unconditional density. The conditional densities are given by
(2.7); the unconditional density can be derived only in the case of
stationarity. In fact, given:

$$x_t = \lambda x_{t-1} + u_t, \qquad u_t \sim N.I.D.(0, \sigma^2),$$

we can obtain by recursive substitution:

$$x_t = u_t + \lambda u_{t-1} + \dots + \lambda^{t-1} u_1 + \lambda^{t} x_0.$$

Only if $|\lambda| < 1$ does the effect of the initial condition disappear, and
we can then write the unconditional density of $x_t$ as:

$$D(x_t \mid \lambda, \sigma^2) = N\left(0, \frac{\sigma^2}{1-\lambda^2}\right).$$

Therefore, under stationarity we can write down the exact likelihood function
as:

$$D(X_T^1 \mid \lambda, \sigma^2) = (2\pi)^{-T/2} \sigma^{-T} (1-\lambda^2)^{1/2} \exp\left( -\frac{1}{2\sigma^2} \left[ (1-\lambda^2) x_1^2 + \sum_{t=2}^{T} (x_t - \lambda x_{t-1})^2 \right] \right) \qquad (2.9)$$

and estimates of the parameters of interest are derived by maximizing this
function. Note that the estimate of $\lambda$ cannot be derived by analytical
methods using the exact likelihood function; it requires conditioning the
likelihood and operating a grid search. Note also that the idea of using the
approximate likelihood function, obtained by dropping the first observation,
works only under the hypothesis of stationarity and in large samples. When the
first observation is dropped and the approximate likelihood function is
considered, it can be shown analytically that the ML estimate of $\lambda$
coincides with the OLS estimate.
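The relation between the exact likelihood (2.9) and the OLS estimate can be explored numerically. What follows is a minimal Python sketch, not part of the original E-Views material: the function names, the sample size, the seed and the choice to hold $\sigma$ fixed at its true value during the grid search are all illustrative assumptions.

```python
import math
import random


def simulate_ar1(lam, sigma, T, seed=0):
    """Simulate x_t = lam * x_{t-1} + u_t, drawing x_1 from the stationary density."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, sigma / math.sqrt(1.0 - lam ** 2))]
    for _ in range(T - 1):
        x.append(lam * x[-1] + rng.gauss(0.0, sigma))
    return x


def exact_loglik(x, lam, sigma):
    """Log of (2.9): T-1 conditional densities plus the unconditional density of x_1."""
    T = len(x)
    ssr = (1.0 - lam ** 2) * x[0] ** 2
    ssr += sum((x[t] - lam * x[t - 1]) ** 2 for t in range(1, T))
    return (-0.5 * T * math.log(2.0 * math.pi) - T * math.log(sigma)
            + 0.5 * math.log(1.0 - lam ** 2) - ssr / (2.0 * sigma ** 2))


def ols_lambda(x):
    """OLS of x_t on x_{t-1}: the ML estimate once the first observation is dropped."""
    num = sum(x[t] * x[t - 1] for t in range(1, len(x)))
    den = sum(x[t - 1] ** 2 for t in range(1, len(x)))
    return num / den


def grid_search_lambda(x, sigma, step=0.001):
    """Maximise the exact log-likelihood over a grid of stationary values of lam."""
    n = int(round(1.0 / step))
    grid = [i * step for i in range(-n + 1, n)]  # open interval (-1, 1)
    return max(grid, key=lambda lam: exact_loglik(x, lam, sigma))


# Illustrative values only (assumed, not taken from the text).
x = simulate_ar1(lam=0.7, sigma=1.0, T=500, seed=42)
lam_ols = ols_lambda(x)
lam_ml = grid_search_lambda(x, sigma=1.0)
print(lam_ols, lam_ml)
```

In a sample of this size the two estimates should differ only marginally, since the first observation contributes a single term to a sum of $T$ terms.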
2.6 Univariate decompositions of time-series

The general solution proposed to the problem introduced in the previous section
is the search for a stationary representation of non-stationary time-series.
This has been done both in a univariate and in a multivariate framework. As an
introduction we shall briefly discuss the methodologies used in a univariate
framework, and then move swiftly to decompositions in a multivariate framework,
which are at the heart of our discussion of modern macroeconometrics.

Beveridge and Nelson (1981) provide an elegant way of decomposing a
non-stationary time-series into a permanent component and a temporary,
cyclical, component by applying ARIMA methods. For any non-stationary
time-series $x_t$ integrated of the first order, the Wold decomposition theorem
can be applied to its first difference, to deliver the following
representation:

$$\Delta x_t = \mu + C(L)\varepsilon_t, \qquad \varepsilon_t \sim n.i.d.(0, \sigma_\varepsilon^2)$$

where $C(L)$ is a polynomial of order $q$ in the lag operator. Consider now the
polynomial $D(L)$ defined as follows:

$$D(L) = C(L) - C(1) \qquad (2.10)$$

Given that $C(1)$ is a constant, $D(L)$ will also be of order $q$. It can
immediately be seen that

$$D(1) = 0,$$

therefore 1 is a root of $D(L)$, and we can write

$$D(L) = C^*(L)(1 - L) \qquad (2.11)$$

where $C^*(L)$ is a polynomial of order $q - 1$.

By equating (2.10) to (2.11) we have:

$$C(L) = C^*(L)(1 - L) + C(1)$$

and

$$\Delta x_t = \mu + C^*(L)\Delta\varepsilon_t + C(1)\varepsilon_t \qquad (2.12)$$

By integrating (2.12) we finally have:

$$x_t = C^*(L)\varepsilon_t + \mu t + C(1) z_t = C_t + TR_t$$

where $z_t$ is a process for which we have $\Delta z_t = \varepsilon_t$. $C_t$
is the cyclical component and $TR_t$ is the trend component, made of a
deterministic trend and a stochastic trend. Note that the trend component can
be represented as follows:

$$TR_t = TR_{t-1} + \mu + C(1)\varepsilon_t.$$

2.6.1 Beveridge-Nelson decomposition of an IMA(1,1) process

Consider the process:

$$\Delta x_t = \varepsilon_t + \theta\varepsilon_{t-1}, \qquad 0 < \theta < 1.$$

In this case we have:

$$C(L) = 1 + \theta L$$

$$C(1) = 1 + \theta$$

$$C^*(L) = \frac{C(L) - C(1)}{1 - L} = -\theta.$$

The BN decomposition gives the following result:

$$x_t = C_t + TR_t = -\theta\varepsilon_t + (1 + \theta) z_t.$$
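The algebra behind $C(1)$ and $C^*(L)$ can be checked mechanically for any finite-order $C(L)$. The following Python sketch is an illustration under stated assumptions: the helper names `bn_polynomials` and `rebuild` are hypothetical, and polynomials are represented simply as lists of coefficients in increasing powers of $L$.

```python
def bn_polynomials(c):
    """Given c = [c_0, ..., c_q], the coefficients of C(L), return C(1) and C*(L)."""
    c1 = sum(c)
    # C*(L) = (C(L) - C(1)) / (1 - L) has coefficients c*_j = -(c_{j+1} + ... + c_q).
    c_star = [-sum(c[j + 1:]) for j in range(len(c) - 1)]
    return c1, c_star


def rebuild(c1, c_star):
    """Recompute the coefficients of C(L) from C(L) = C*(L)(1 - L) + C(1)."""
    out = [0.0] * (len(c_star) + 1)
    for j, cs in enumerate(c_star):
        out[j] += cs        # C*(L) times 1
        out[j + 1] -= cs    # C*(L) times -L
    out[0] += c1
    return out


# IMA(1,1) with theta = 0.5 (an illustrative value): C(L) = 1 + 0.5 L.
c1, c_star = bn_polynomials([1.0, 0.5])
print(c1, c_star)            # C(1) = 1 + theta and C*(L) = -theta
print(rebuild(c1, c_star))   # recovers the coefficients of C(L)
```

The same two functions apply unchanged to a higher-order moving average, where $C^*(L)$ is of order $q - 1$.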

2.6.2 Beveridge-Nelson decomposition of an ARIMA(1,1,1) process

Consider the process:

$$\Delta x_t = \rho\Delta x_{t-1} + \varepsilon_t + \theta\varepsilon_{t-1}$$

In this case we have

$$C(L) = \frac{1 + \theta L}{1 - \rho L}$$

$$C(1) = \frac{1 + \theta}{1 - \rho}$$

$$C^*(L) = \frac{C(L) - C(1)}{1 - L} = -\frac{\theta + \rho}{(1 - \rho)(1 - \rho L)}$$

and the BN decomposition gives the following result:

$$x_t = C_t + TR_t = -\frac{\theta + \rho}{(1 - \rho)(1 - \rho L)}\varepsilon_t + \frac{1 + \theta}{1 - \rho} z_t$$
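As a check on the expression for $C^*(L)$ in this case, a short derivation using only the definitions of $C(L)$ and $C(1)$ runs as follows:

$$C(L) - C(1) = \frac{1 + \theta L}{1 - \rho L} - \frac{1 + \theta}{1 - \rho} = \frac{(1 + \theta L)(1 - \rho) - (1 + \theta)(1 - \rho L)}{(1 - \rho)(1 - \rho L)} = -\frac{(\theta + \rho)(1 - L)}{(1 - \rho)(1 - \rho L)},$$

so that dividing by $(1 - L)$ leaves $C^*(L) = -\frac{\theta + \rho}{(1 - \rho)(1 - \rho L)}$. Setting $\rho = 0$ returns the IMA(1,1) result $C^*(L) = -\theta$.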

2.6.3 Deriving the Beveridge-Nelson decomposition in practice 

A BN decomposition for any ARIMA process is easily derived in practice by
applying a methodology suggested by Cuddington and Winters ([6]). For any I(1)
process, we have seen that the stochastic trend can be represented as follows:

$$TR_t = TR_{t-1} + \mu + C(1)\varepsilon_t \qquad (2.13)$$

The decomposition can then be applied by the following steps:

• identify the appropriate ARIMA model and estimate $\varepsilon_t$, $\mu$ and
all the parameters in $C(1)$

• given an initial value for $TR_0$, use (2.13) to generate the permanent
component of the time-series

• generate the cyclical component as the difference between the observed
value in each period and the permanent component

The above procedure will give the permanent component up to a constant. If the
precision of this procedure is not satisfactory, one can use further conditions
to identify the decomposition more precisely. For example, one can impose the
condition that the sample mean of the cyclical component is zero to pin down
the constant in the permanent component.

To illustrate how the procedure works in practice we have simulated an
ARIMA(1,1,1) in E-Views for a sample of 200 observations, by running the
following programme:

' initialise x over the first two observations
smpl 1 2
genr x=0
' draw standard normal innovations for the whole sample
smpl 1 200
genr u=nrnd
' generate x recursively with rho=0.6 and theta=0.5
smpl 3 200
series x=x(-1)+0.6*x(-1)-0.6*x(-2)+u+0.5*u(-1)

From the previous section we know the exact BN decomposition of our $x_t$:

$$x_t = C_t + TR_t = -\frac{1.1}{(1 - 0.6)(1 - 0.6L)}\varepsilon_t + \frac{1.5}{0.4} z_t$$

$$TR_t = TR_{t-1} + \frac{1.5}{0.4}\varepsilon_t$$

We can therefore generate the permanent component of X and the transitory
component as follows:

' initialise the trend over the first two observations
smpl 1 2
genr TR=0
' accumulate the trend innovations and take the cycle as the residual
smpl 3 200
series TR=TR(-1)+(1.5/0.4)*u
genr CYCLE=X-TR

The series X, TR and CYCLE are reported in Figure 2.7.

This is exactly the procedure that we follow in practice, except that we
estimate parameters rather than impute them from the known DGP.
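For readers without access to E-Views, the same experiment can be sketched in Python. This is an illustrative translation under stated assumptions, not the original programme: the function name `simulate_bn` and the seed are hypothetical, and the random draws will of course differ from those behind Figure 2.7.

```python
import random


def simulate_bn(T=200, rho=0.6, theta=0.5, seed=0):
    """Simulate the ARIMA(1,1,1) series, its BN trend and the implied cycle."""
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(T)]
    x = [0.0, 0.0]    # x is set to zero over the first two observations
    tr = [0.0, 0.0]   # the initial trend is set to zero: known up to a constant
    c1 = (1.0 + theta) / (1.0 - rho)   # C(1) = 1.5 / 0.4
    for t in range(2, T):
        x.append(x[t - 1] + rho * (x[t - 1] - x[t - 2]) + u[t] + theta * u[t - 1])
        tr.append(tr[t - 1] + c1 * u[t])
    cycle = [xi - ti for xi, ti in zip(x, tr)]
    return x, tr, cycle, u


x, tr, cycle, u = simulate_bn(T=200, seed=1)

# The cycle should obey (1 - rho L) C_t = -(theta + rho)/(1 - rho) u_t up to a
# constant fixed by the arbitrary initial conditions.
k = (0.5 + 0.6) / (1.0 - 0.6)
d = [cycle[t] - 0.6 * cycle[t - 1] + k * u[t] for t in range(2, 200)]
print(max(d) - min(d))   # close to zero: d_t is the same constant in every period
```

The constant picked up by `d` is the practical counterpart of the remark that the procedure delivers the permanent component only up to a constant.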

2.6.4 Assessing the Beveridge-Nelson decomposition 

The properties of the permanent and temporary components of an integrated
time-series delivered by the BN decomposition are worth some comments. The
innovations in the permanent and the transitory components are perfectly
negatively correlated; moreover, the trend component is more volatile than the
actual time series, as the negative correlation between the permanent and the
transitory component acts to smooth the original time-series. These results are
easily seen for the simplest case we have already discussed. For example, in
the case of the IMA(1,1) process with $\theta = 0.5$, the innovations in the
permanent and transitory components are $1.5\varepsilon_t$ and
$-0.5\varepsilon_t$, so that their correlation is $-1$, while the variance of
the innovation in the trend component is
$(1.5)^2\sigma_\varepsilon^2 > \sigma_\varepsilon^2$. Note that in general the
variance of innovations might have an economic interpretation, and economic
theory might suggest patterns of correlation between innovations different from
a perfect negative correlation. As we shall see in one of the next chapters, an
interesting pattern could be the absence of correlation between the innovations
in the cycle and trend components of an integrated time-series. In general,
different restrictions on the correlation between the trend and cycle
components lead to the identification of different stochastic trends for
integrated time-series. As a consequence the Beveridge-Nelson decomposition is
not unique. In general, univariate decompositions are not unique. To see this
point more explicitly we can compare the BN trend with the trend extracted
using an alternative technique which has recently been very successful in
time-series analysis: the Hodrick-Prescott filter.

Fig. 2.7. A Beveridge-Nelson decomposition of an ARIMA(1,1,1) process (series plotted: CYCLE, TR, X)

Hodrick and Prescott proposed their method to analyze postwar U.S. business
cycles in a working paper circulated in the early 1980s and published in 1997
([27]). The Hodrick-Prescott (HP) filter computes the permanent component
$TR_t$ of a series $x_t$ by minimizing the variance of $x_t$ around $TR_t$,
subject to a penalty that constrains the second difference of $TR_t$. That is,
the HP filter is derived by minimizing the following expression:

$$\sum_{t=1}^{T} (x_t - TR_t)^2 + \lambda \sum_{t=2}^{T-1} \left[ (TR_{t+1} - TR_t) - (TR_t - TR_{t-1}) \right]^2.$$

The penalty parameter $\lambda$ controls the smoothness of the series, by
controlling the ratio of the variance of the cyclical component to the variance
of the series. The larger $\lambda$, the smoother $TR_t$, which approaches a
linear trend as $\lambda$ goes to infinity. In practical applications $\lambda$
is set to 100 for annual data, 1600 for quarterly data and 14400 for monthly
data.
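The minimisation has a closed-form solution: the first-order conditions are the linear system $(I + \lambda K'K)\,TR = x$, where $K$ is the matrix that takes second differences. A minimal pure-Python sketch follows; the function name `hp_filter` and the test series are assumptions made for illustration, and this is not the routine used by E-Views.

```python
def hp_filter(x, lam):
    """Return the HP trend: the solution of (I + lam * K'K) tau = x."""
    T = len(x)
    # Build A = I + lam * K'K; each row of K holds the weights (1, -2, 1).
    A = [[1.0 if i == j else 0.0 for j in range(T)] for i in range(T)]
    w = (1.0, -2.0, 1.0)
    for r in range(T - 2):
        for a in range(3):
            for b in range(3):
                A[r + a][r + b] += lam * w[a] * w[b]
    # A is symmetric, positive definite and pentadiagonal, so Gaussian
    # elimination without pivoting on the band is sufficient.
    rhs = list(x)
    for i in range(T):
        for j in range(i + 1, min(i + 3, T)):
            f = A[j][i] / A[i][i]
            if f != 0.0:
                for k in range(i, min(i + 3, T)):
                    A[j][k] -= f * A[i][k]
                rhs[j] -= f * rhs[i]
    # Back substitution on the resulting upper-triangular band.
    tau = [0.0] * T
    for i in range(T - 1, -1, -1):
        s = sum(A[i][k] * tau[k] for k in range(i + 1, min(i + 3, T)))
        tau[i] = (rhs[i] - s) / A[i][i]
    return tau


# A linear series has zero second differences, so it is its own HP trend.
line = [2.0 + 0.5 * t for t in range(40)]
trend = hp_filter(line, lam=1600.0)
print(max(abs(a - b) for a, b in zip(line, trend)))
```

Because the columns of $K'K$ sum to zero, the extracted trend always has the same sample mean as the original series, so the HP cycle has zero mean by construction.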

In Figure 2.8 we report the BN trend and the HP trend (with $\lambda = 100$)
for the data generated in the previous section.

Fig. 2.8. Trend components: Hodrick-Prescott versus Beveridge-Nelson

Note that the BN trend is more volatile than the HP trend. It is possible to
increase the volatility of the HP trend by reducing the parameter $\lambda$;
however, the HP filter can reach at most the volatility of the actual time
series which, as we already know, is smaller than the volatility of the BN
trend.

The HP filter has the advantage of removing the same trend from all time
series; this might be desirable as some theoretical models, for example real
business cycle models, indicate that macroeconomic variables share the same
stochastic trend. However, it has been shown by Harvey and Jaeger ([22]) that
the use of such a filter can lead to the identification of spurious cyclical
behaviour. In fact these two authors advocate a different approach to modelling
time-series, known as structural time series modelling, which we do not
consider in our analysis as it is not related to macroeconomic models, but
which certainly merits some attention (see footnote 2).

The comparison between the HP and the BN trend reinforces the argument of
non-uniqueness of univariate decompositions made before. Moreover, we are left
with the problem of how to use the filtered series in applied macroeconometrics
and how to relate them to theoretical models. The empirical counterparts of
theoretical macroeconomic models are multivariate time-series. Theoretical
models often predict that different time-series share the same stochastic
trend. The natural question at this point is whether the problem of
non-stationarity in time-series could be resolved by considering multivariate
models. In this context, stationarity is obtained by considering combinations
of non-stationary time series sharing the same stochastic trends. If such
results could be achieved, it would in principle be possible to justify the
identification of trends by relating them to macroeconomic theory. We shall
consider this possibility in the next sections.


2.7 Multivariate decompositions and dynamic models 

Let us reconsider our spurious regression for US consumption in the context of
a dynamic model. We do so by augmenting the static regression to consider
consumption and income lagged up to one year, i.e. we consider four lags of
each variable. The results, shown in Table 4, indicate that the spurious
regression result disappears: contemporaneous and lagged US disposable income
is significant in explaining US consumption, while contemporaneous and lagged
UK disposable income is not.


2 We refer the interested reader to the work by Andrew Harvey and Augustin Maravall. 
TABLE 4: A dynamic model for US consumption
Dependent variable LCUS_t, regression by OLS, 1960:1-1998:1

                     Model with US income        Model with UK income
                     Coefficient    S.E.         Coefficient    S.E.
c                      0.367        0.106          0.333        0.150
LCUS_{t-1}             0.987        0.087          1.197        0.083
LCUS_{t-2}            -0.006        0.120         -0.156        0.131
LCUS_{t-3}             0.012        0.121          0.142        0.130
LCUS_{t-4}            -0.172        0.085         -0.196        0.082
LYUS_t                 0.258        0.037
LYUS_{t-1}            -0.126        0.049
LYUS_{t-2}            -0.068        0.050
LYUS_{t-3}             0.021        0.049
LYUS_{t-4}             0.034        0.042
LYUK_t                                             0.009        0.020
LYUK_{t-1}                                         0.018        0.028
LYUK_{t-2}                                        -0.034        0.028
LYUK_{t-3}                                        -0.0163       0.028
LYUK_{t-4}                                         0.0015       0.0229
Trend                  0.00039      0.0001         0.00023      0.0001

R^2                    0.99                        0.99
S.E.                   0.0037                      0.0042
F-test on income       F(5,155)=10.324             F(5,155)=1.239



This is an interesting result, which suggests that, provided the problems
related to non-stationarity can also be solved, dynamic multivariate
time-series models are the right foundation for macroeconometrics.

2.7.1 Cointegration and Error Correction Models 

To explain why the spurious results disappear when dynamic models are
estimated, let us consider a simplified version of the dynamic specification
estimated for consumption:

$$c_t = a_0 + a_1 c_{t-1} + a_2 y_t + a_3 y_{t-1} + u_t \qquad (2.14)$$

This specification has some interesting dynamic properties which are worth
discussing. First note that the short-run elasticity of consumption with
respect to income is different from the long-run elasticity. In fact the
short-run elasticity is $a_2$, while the long-run elasticity is
$\frac{a_2 + a_3}{1 - a_1}$. The long-run elasticity is found by setting all
variables in the dynamic model (2.14) to their steady-state values,
$c_{t+i} = c$, $y_{t+i} = y$. To see this point immediately, consider the
following re-parameterisation of (2.14):

$$\Delta c_t = a_0 + a_2\Delta y_t - \alpha(c_{t-1} - \beta_1 y_{t-1}) + u_t \qquad (2.15)$$

$$\alpha = (1 - a_1), \qquad \beta_1 = \frac{a_2 + a_3}{1 - a_1} \qquad (2.16)$$

The estimated dynamic model includes both first differences and levels. The
presence of the level variables generates a long-run solution, derived by
setting all first differences either to zero (steady state with no
deterministic trend) or to a constant (steady-state growth). Note now the role
of the terms in levels: we can interpret $\beta_1 y_{t-1}$ as the long-run
equilibrium level $c^*_{t-1}$ for the log of real consumption $c$. When
$\alpha > 0$, consumption increases at time $t$ whenever
$c_{t-1} < c^*_{t-1}$, and decreases whenever $c_{t-1} > c^*_{t-1}$. The system
equilibrates in the presence of disequilibrium (i.e. a discrepancy between $c$
and $c^*$); such an error correction feature guarantees that in the long run
consumption will converge to its equilibrium value. For this reason the
specification (2.15), with $\alpha > 0$, is termed an Error Correction Model
(ECM). Note that, in the case of an ECM representation, the difference between
$c$ and $c^*$ is a stationary series. This defines co-integration. We say that
two non-stationary variables integrated of order $q$ are cointegrated of order
$p$ if there exists a linear combination of them which is integrated of order
$q - p$. The case $p = 1$, $q = 1$ is interesting in that co-integration
implies an ECM representation, which allows us to rewrite a model in levels,
which involves non-stationary time-series, as a model which involves only
stationary variables. Such variables are stationary either because they are
first differences of non-stationary variables or because they are stationary
linear combinations of non-stationary variables (cointegrating vectors).
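The equivalence between the model in levels (2.14) and its error correction form (2.15)-(2.16) can be verified numerically. The sketch below is illustrative only: the coefficient values are hypothetical, chosen so that the long-run elasticity equals one, and the function names are not taken from the text.

```python
def adl_step(c_lag, y, y_lag, a0, a1, a2, a3, u=0.0):
    """One step of (2.14): c_t = a0 + a1*c_{t-1} + a2*y_t + a3*y_{t-1} + u_t."""
    return a0 + a1 * c_lag + a2 * y + a3 * y_lag + u


def ecm_step(c_lag, y, y_lag, a0, a1, a2, a3, u=0.0):
    """The same step written as (2.15)-(2.16): c_t = c_{t-1} + Delta c_t."""
    alpha = 1.0 - a1
    beta1 = (a2 + a3) / (1.0 - a1)
    dc = a0 + a2 * (y - y_lag) - alpha * (c_lag - beta1 * y_lag) + u
    return c_lag + dc


# Hypothetical coefficients: short-run elasticity 0.25, long-run elasticity 1.
a0, a1, a2, a3 = 0.0, 0.8, 0.25, -0.05

# The two parameterisations deliver the same value of c_t.
print(adl_step(0.3, 1.1, 1.0, a0, a1, a2, a3), ecm_step(0.3, 1.1, 1.0, a0, a1, a2, a3))

# With income fixed at y = 1, consumption converges to beta1 * y = 1.
c = 0.0
for _ in range(200):
    c = ecm_step(c, 1.0, 1.0, a0, a1, a2, a3)
print(c)
```

The convergence experiment also shows the role of $\alpha = 1 - a_1$: a fraction $\alpha$ of the gap between $c$ and $c^*$ is closed in each period.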

The inclusion of both differences and levels in the estimated relationship is
the key to the solution of the problems related to the non-stationarity of the
levels of the variables included in the specification. This solution to the
non-stationarity problem also has the feature of immediately revealing to the
economist the long-run properties of the estimated model. To see this point in
practice we can use E-Views to simulate the following bivariate model:

$$\Delta c_t = 0.25\Delta y_t - 0.2(c_{t-1} - y_{t-1}) + 0.003u_{1t} \qquad (2.17)$$

$$\Delta y_t = 0.02 + 0.009u_{2t}$$

where $u_{1t}$ and $u_{2t}$ are independently distributed standard normal
variables, and the parameters are calibrated to reflect the long-run properties
of the consumption function reported in Table 4. The volatilities of the
innovations are again calibrated to processes estimated on real data for the US
economy; income is more volatile than consumption.

To show the properties of the model, we first generate samples for the two
innovation processes, then we generate artificial data for consumption and
income by constructing the above model and solving it dynamically. We do so for
a sample of 100 observations; the simulated series are plotted in Figures 2.9
and 2.10.

Fig. 2.9. Two cointegrated series

Fig. 2.10. Disequilibrium

Note that the levels of LC and LY share a stochastic trend, which disappears
from the series (LC-LY). The parameter $\alpha$ in the ECM specification
determines the speed of adjustment in the presence of disequilibrium. To
illustrate the role of this parameter we report two series (LC-LY) generated by
taking the same innovations for the sample 1 200. The innovations are drawn
from independent normal distributions for all observations, except for
observation 101, where the residual in the income process is augmented by
0.036. We then have a shock four standard deviations away from the mean of the
distribution, and we can visually inspect the behaviour of the simulated series
in the presence of an outlier. The process (2.17) is used to generate the first
time-series of disequilibria, while the second time-series is generated keeping
all parameters unchanged with the exception of $\alpha$, which is trebled from
0.2 to 0.6. The resulting observations for the disequilibria are reported in
Figure 2.11.

The disequilibria from the second simulation run are less persistent, which
shows that the second system features a faster speed of adjustment in the
presence of disequilibrium. All the simulated series are contained in the
E-Views workfile ECM.WF1, with which the reader can experiment to convince
herself of the properties of Error Correction Models.
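The experiment can also be sketched outside E-Views. The Python code below is an illustrative reconstruction under stated assumptions, not the content of ECM.WF1: the function name, the seed and the zero starting values are hypothetical, and the random draws differ from those behind the figures.

```python
import random


def simulate_ecm(T=200, alpha=0.2, shock_at=100, shock=0.036, seed=0):
    """Simulate (2.17); return log consumption, log income and c - y."""
    rng = random.Random(seed)
    c, y = [0.0], [0.0]
    for t in range(1, T):
        u1, u2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        # Index 100 is observation 101, where the income innovation is augmented.
        dy = 0.02 + 0.009 * u2 + (shock if t == shock_at else 0.0)
        dc = 0.25 * dy - alpha * (c[-1] - y[-1]) + 0.003 * u1
        c.append(c[-1] + dc)
        y.append(y[-1] + dy)
    return c, y, [ci - yi for ci, yi in zip(c, y)]


def shock_response(alpha, horizon):
    """Effect of the outlier on c - y, `horizon` periods after it occurs."""
    _, _, with_shock = simulate_ecm(alpha=alpha)
    _, _, baseline = simulate_ecm(alpha=alpha, shock=0.0)
    return with_shock[100 + horizon] - baseline[100 + horizon]


# Same innovations, two speeds of adjustment.
for a in (0.2, 0.6):
    print(a, shock_response(a, 0), shock_response(a, 5))
```

Since the model is linear, the effect of the outlier on the disequilibrium decays geometrically at rate $(1 - \alpha)$: after five periods roughly a third of it is left with $\alpha = 0.2$, and about one per cent with $\alpha = 0.6$.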

Fig. 2.11. Speed of adjustments and disequilibrium

As an application of further interest, let us reconsider the static regression
in the light of our discussion of dynamic models.

Given the following DGP:

$$y_t = a_1 y_{t-1} + a_2 x_t + a_3 x_{t-1} + u_{1t} \qquad (2.18)$$

$$x_t = b_1 x_{t-1} + u_{2t}$$

a static model is estimated by OLS:

$$y_t = \gamma x_t + \varepsilon_t, \qquad \hat{\gamma} = \frac{\sum x_t y_t}{\sum x_t^2}.$$

We assess the results of running the static model by taking
$\text{plim}\,\hat{\gamma}$:

$$\text{plim}\,\hat{\gamma} = \text{plim}\left[ a_1 \frac{\sum x_t y_{t-1}/T}{\sum x_t^2/T} + a_2 + a_3 \frac{\sum x_t x_{t-1}/T}{\sum x_t^2/T} + \frac{\sum x_t u_{1t}/T}{\sum x_t^2/T} \right]$$

Under the hypothesis that (2.18) is stationary ($|b_1| < 1$) we can substitute
for $x_t$ in terms of $x_{t-1}$ and $u_{2t}$ and apply Slutsky's and Cramer's
theorems to derive the following result:

$$\text{plim}\,\hat{\gamma} = \frac{a_2 + a_3 b_1}{1 - a_1 b_1}, \qquad a_2 < \text{plim}\,\hat{\gamma} < \frac{a_2 + a_3}{1 - a_1}.$$
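The probability limit above is easy to evaluate for different degrees of persistence in $x_t$. The following Python sketch uses hypothetical coefficients (short-run elasticity 0.3, long-run elasticity 1) purely for illustration.

```python
def plim_static_gamma(a1, a2, a3, b1):
    """Probability limit of the static OLS slope when |b1| < 1."""
    if abs(b1) >= 1.0:
        raise ValueError("the formula is derived under |b1| < 1")
    return (a2 + a3 * b1) / (1.0 - a1 * b1)


# Hypothetical DGP: a2 = 0.3 and (a2 + a3) / (1 - a1) = 1.
a1, a2, a3 = 0.5, 0.3, 0.2
for b1 in (0.0, 0.5, 0.9, 0.999999):
    print(b1, plim_static_gamma(a1, a2, a3, b1))
```

With these values the limit moves from the short-run elasticity at $b_1 = 0$ towards the long-run elasticity as $b_1$ approaches one.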
Note that as $b_1$ approaches 0 the elasticity of $y$ with respect to $x$
delivered by the static regression goes asymptotically to the true short-run
elasticity, while as $b_1$ approaches 1 such elasticity converges to the
long-run elasticity. Technically speaking, we cannot show what happens when
$b_1$ is one, because this violates the stationarity conditions which we have
used to derive the asymptotic behaviour of the OLS estimator. However,
confirming the above intuition, Stock ([55]) has shown that the OLS estimator
of the parameters determining the long-run relationship between non-stationary
cointegrated series is super-consistent. In fact it converges towards the true
value at speed $T$, much higher than the speed $\sqrt{T}$ with which the OLS
estimator converges to its true value in regressions between stationary time
series. This result has given some background to a two-step research strategy,
according to which the cointegrating relation is estimated first in a static
model and then used to estimate a dynamic ECM model, involving only stationary
variables. This strategy is less efficient than the simultaneous estimation of
short-run and long-run dynamics. In fact the static regression delivers
super-consistent estimates of the cointegrating parameter despite being
mis-specified, because the omitted variables are the stationary variables
determining the short-run dynamics, which, in large samples, should not affect
the estimation of cointegrating parameters. It has been shown through
Monte-Carlo simulation that the sample sizes required to appeal to the
super-consistency theorem are much larger than the sample sizes usually
available for time-series modelling (see, for example, [2], [3], [4]).
Moreover, the empirical counterparts of macroeconomic models are usually
dynamic multivariate time-series models. Therefore, there must be a price to be
paid in considering static univariate models as a basis for empirical work. We
shall devote more attention to this issue in the next section.

2.7.2 Cointegration in a multivariate framework 

So far we have stressed the importance of the magnitude of the adjustment
parameter $\alpha$ as the relevant discriminant to decide on cointegration, but
we have not yet provided a statistical framework to test such a hypothesis. We
also mentioned the importance of the dimensionality of the system to be
considered in empirical work. In this section we shall elaborate on these
points and illustrate Johansen's ([30], [34]) approach to cointegration in a
multivariate framework.

So far we have considered cointegration in a bivariate context. Things differ
when we consider a multivariate context. In fact, in general between $n$
non-stationary series we can have up to $n - 1$ cointegrating vectors, and
single-equation dynamic modelling can cause serious trouble when there are
multiple cointegrating vectors. To illustrate the problem let us consider the
case of an investigator who uses cointegration techniques to investigate money
demand. Following the standard economic background to empirical investigations
of money demand (see, for example, Hendry and Ericsson, [26]) the chosen data
set includes money, $m$, a price index, $p$, real income, $y$, the own interest
rate on money, $R^m$, and the opportunity cost of holding money, $R^b$. All
variables are in logarithms, with the exception of interest rates. The
investigator specifies a dynamic single-equation model for real money, with a
view to identifying a money demand equation, which takes the following shape:


$$(m - p)_t = a_0 + a_1 (m - p)_{t-1} + a_2 y_{t-1} + a_3 y_{t-2} + a_4 R^m_{t-1} + a_5 R^m_{t-2} + a_6 R^b_{t-1} + a_7 R^b_{t-2} + u_t \qquad (2.19)$$

This statistical model fits the data well. As it is found that $a_1 < 1$, the
investigation leads to the identification of a long-run equilibrium money
demand, which results clearly from the ECM re-parameterisation of the dynamic
model (2.19):

$$\Delta(m - p)_t = a_0 - a_3\Delta y_{t-1} - a_5\Delta R^m_{t-1} - a_7\Delta R^b_{t-1} + (a_1 - 1)\left[ (m - p)_{t-1} - (m - p)^*_{t-1} \right] + u_t \qquad (2.20)$$

$$(m - p)^*_{t-1} = \frac{a_2 + a_3}{1 - a_1} y_{t-1} + \frac{a_4 + a_5}{1 - a_1} R^m_{t-1} + \frac{a_6 + a_7}{1 - a_1} R^b_{t-1}$$


However, the good fit of the statistical model might be combined with an
incorrect identification of the long-run solution. Think, for example, of the
case in which the non-stationary vector containing the variables of interest
admits two cointegrating relationships: $(m - p - y)$ and
$(R^m - \beta_{22} R^b)$. The first one is generated by the stationarity of the
velocity of circulation of money, and the second one by the behaviour of the
banking sector, which sets the interest rate on money as a mark-down on the
opportunity cost of holding money. In the short run money reacts to
disequilibria with respect to both long-run solutions, hence money demand is
correctly parameterised as follows:

$$\Delta(m - p)_t = \pi_0 + \pi_1\Delta y_{t-1} + \pi_2\Delta R^m_{t-1} + \pi_3\Delta R^b_{t-1} - \alpha_1 (m_{t-1} - p_{t-1} - y_{t-1}) + \alpha_2 (R^m_{t-1} - \beta_{22} R^b_{t-1}) + u_t \qquad (2.21)$$

Note that the statistical specification of (2.20) and (2.21) is identical; in
fact the residuals $u_t$ are the same. However, identification is very
different. When (2.21) represents the correct specification, (2.20) identifies
as long-run elasticities what are in fact mixtures of cointegrating parameters
and parameters determining the speed of adjustment with respect to disequilibria
in the true model. The single-equation approach leads one to believe that the
long-run elasticity of money demand with respect to the opportunity cost of
holding money is $\frac{a_6 + a_7}{1 - a_1}$, while in fact such an estimated
coefficient is a convolution of the parameter $\alpha_2$, determining the speed
with which money demand reacts to a misalignment of interest rates with respect
to their equilibrium value, and the parameter $\beta_{22}$, determining the
mark-down of the own interest rate on money with respect to the opportunity
cost of holding money. This mis-identification has serious consequences for the
interpretation of estimated parameters. In fact, when the above problem occurs,
a structural instability in the short-term adjustment parameter $\alpha_2$
would mislead the researcher into the belief that long-run money demand is
unstable.

The solution of this identification problem requires a framework to allow the 
researcher to find the number of cointegrating vectors among a set of variables 
and to identify them. The procedure proposed by Johansen([30], [32]) within the 
framework of the Vector Autoregressive Model achieves both results. 

2.7.3 The Johansen Procedure 

To illustrate the procedure proposed by Johansen, consider the multivariate
generalisation of the single-equation dynamic model discussed so far, i.e. a
Vector Autoregressive Model (VAR) for the vector of $m$, possibly
non-stationary, variables $y$:

$$y_t = A_1 y_{t-1} + A_2 y_{t-2} + \dots + A_n y_{t-n} + u_t \qquad (2.22)$$

By proceeding in the same way as we did for the simple single-equation dynamic
model, we can reparameterise the VAR in levels as a model involving levels and
first differences of variables.

Start by subtracting $y_{t-1}$ from both sides of the VAR to obtain:

$$\Delta y_t = (A_1 - I) y_{t-1} + A_2 y_{t-2} + \dots + A_n y_{t-n} + u_t \qquad (2.23)$$

Now add and subtract $(A_1 - I) y_{t-2}$ on the right-hand side to obtain:

$$\Delta y_t = (A_1 - I)\Delta y_{t-1} + (A_1 + A_2 - I) y_{t-2} + \dots + A_n y_{t-n} + u_t \qquad (2.24)$$

By iterating this procedure until $n - 1$, we end up with the following
specification:

$$\Delta y_t = \Pi_1\Delta y_{t-1} + \Pi_2\Delta y_{t-2} + \dots + \Pi y_{t-n} + u_t \qquad (2.25)$$

$$= \sum_{i=1}^{n-1} \Pi_i\Delta y_{t-i} + \Pi y_{t-n} + u_t \qquad (2.26)$$

where:

$$\Pi_i = -\left(I - \sum_{j=1}^{i} A_j\right), \qquad \Pi = -\left(I - \sum_{j=1}^{n} A_j\right).$$


Clearly the long-run properties of the system are described by the properties
of the matrix $\Pi$. There are three cases of interest:

• rank($\Pi$) = 0. The system is non-stationary, with no cointegration between
the variables considered. This is the only case in which non-stationarity is
correctly removed just by taking first differences of the variables considered.

• rank($\Pi$) = $m$, full. The system is stationary.

• rank($\Pi$) = $k < m$. The system is non-stationary but there are $k$
cointegrating relationships among the considered variables. In this case we
have $\Pi = \alpha\beta'$, where $\alpha$ is an $(m \times k)$ matrix of weights
and $\beta$ is an $(m \times k)$ matrix of parameters determining the
cointegrating relationships.

Therefore, the rank of $\Pi$ is crucial in determining the number of
cointegrating vectors. The Johansen procedure is based on the fact that the
rank of a matrix is equal to the number of its characteristic roots that differ
from zero. Here is the intuition on how the tests can be constructed. Having
obtained estimates for the parameters in the $\Pi$ matrix, we associate with
them estimates for the $m$ characteristic roots and we order them as follows:
$\lambda_1 > \lambda_2 > \dots > \lambda_m$. If the variables are not
cointegrated, then the rank of $\Pi$ is zero and all the characteristic roots
will be zero. In this case each of the expressions $\ln(1 - \lambda_i)$ will be
zero. If instead the rank of $\Pi$ is one, and $0 < \lambda_1 < 1$, then
$\ln(1 - \lambda_1)$ will be negative and
$\ln(1 - \lambda_2) = \ln(1 - \lambda_3) = \dots = \ln(1 - \lambda_m) = 0$.
Johansen derives a test on the number of characteristic roots that are
different from zero by considering the two following statistics:


$$\lambda_{trace}(k) = -T \sum_{i=k+1}^{m} \ln(1 - \hat{\lambda}_i)$$

$$\lambda_{max}(k, k+1) = -T \ln(1 - \hat{\lambda}_{k+1})$$

where $T$ is the number of observations used to estimate the VAR. The first
statistic tests the null of at most $k$ cointegrating vectors against a generic
alternative. The test should be run in sequence, starting from the null of at
most zero cointegrating vectors up to the case of at most $m$ cointegrating
vectors. The second statistic tests the null of at most $k$ cointegrating
vectors against the alternative of at most $k + 1$ cointegrating vectors. Both
statistics will be small under the null hypothesis. Critical values are
tabulated by Johansen and they depend on the number of non-stationary
components under the null and on the specification of the deterministic
component of the VAR. Johansen has shown in the past ([33]) some preference for
the trace test, on the argument that the maximum eigenvalue test does not give
rise to a coherent testing strategy.
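Once the characteristic roots have been estimated, the two statistics are simple to compute. The Python sketch below only evaluates the formulas; the eigenvalues and the sample size are hypothetical, and the critical values, which must be taken from Johansen's tables, are not included.

```python
import math


def trace_stat(eigs, T, k):
    """Trace statistic: -T * sum of ln(1 - lambda_i) for i = k+1, ..., m."""
    lam = sorted(eigs, reverse=True)
    return -T * sum(math.log(1.0 - l) for l in lam[k:])


def max_stat(eigs, T, k):
    """Maximum eigenvalue statistic: -T * ln(1 - lambda_{k+1})."""
    lam = sorted(eigs, reverse=True)
    return -T * math.log(1.0 - lam[k])


# Hypothetical estimated roots for a system of m = 3 variables.
eigs = [0.30, 0.10, 0.0]
T = 100
for k in range(3):
    print(k, trace_stat(eigs, T, k), max_stat(eigs, T, k))
```

By construction the trace statistic for a given $k$ is the sum of the maximum eigenvalue statistics from $k$ onwards, which is why a root equal to zero contributes nothing to either test.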

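To illustrate briefly the intuition behind the procedure, consider the VAR
representation of our simple dynamic model (2.18), introduced in one of the
previous sections, for the two variables $x$ and $y$:

$$\begin{pmatrix} y_t \\ x_t \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_{t-1} \\ x_{t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} \qquad (2.27)$$

(2.27) can be reparameterised as follows in terms of the VECM representation:

$$\begin{pmatrix} \Delta y_t \\ \Delta x_t \end{pmatrix} = \begin{pmatrix} a_{11} - 1 & a_{12} \\ 0 & 0 \end{pmatrix} \begin{pmatrix} y_{t-1} \\ x_{t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} \qquad (2.28)$$

from which it is clear that

$$\Pi = \begin{pmatrix} a_{11} - 1 & a_{12} \\ 0 & 0 \end{pmatrix}, \qquad \alpha = \begin{pmatrix} a_{11} - 1 \\ 0 \end{pmatrix}, \qquad \beta' = \begin{pmatrix} 1 & \frac{a_{12}}{a_{11} - 1} \end{pmatrix}.$$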

To expand on this intuition let us reconsider our example on money demand 
from the previous section. 

The baseline VAR could be specified as follows: 


(m — p) t 


" (m — p) t -i~ 


" (m — p) t _ 
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Which could then be reparameterised in VECM form: 
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Given that we know that there are two cointegrating vectors, we have: 


n = af3' 
rank II = 2 

1-10 0 
0 0 l-/? 22 

As we have analysed only one equation in our previous discussion of the
system, the only constraints we have on the specification for $\alpha$ are
$\alpha_{11} < 0$, $\alpha_{12} > 0$. A possible specification for $\alpha$
would then be:

$$ \alpha = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ 0 & 0 \\ 0 & \alpha_{32} \\ 0 & 0 \end{pmatrix} $$

With the above specification for the loadings, money demand adjusts both in
the presence of misalignments of velocity with respect to the equilibrium
velocity and of misalignments of interest rates with respect to their
equilibrium spread. In particular, money demand increases when velocity is
"too high" and the opportunity cost of holding money is "too low". In case of
disequilibrium in interest rates it is the interest on money which adjusts,
while the dynamics of interest rates on the alternative to money in agents'
portfolios does not react to disequilibria.


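$$ \Pi = \alpha\beta' = \begin{pmatrix} \alpha_{11} & \alpha_{12} \\ 0 & 0 \\ 0 & \alpha_{32} \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 1 & -1 & 0 & 0 \\ 0 & 0 & 1 & -\beta_{22} \end{pmatrix} = \begin{pmatrix} \alpha_{11} & -\alpha_{11} & \alpha_{12} & -\alpha_{12}\beta_{22} \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \alpha_{32} & -\alpha_{32}\beta_{22} \\ 0 & 0 & 0 & 0 \end{pmatrix} $$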


2.7.4 Identification of multiple cointegrating vectors 

The Johansen procedure allows us to identify the number of cointegrating
vectors. However, in the case of multiple cointegrating vectors, an
interesting identification problem arises. In fact, $\alpha$ and $\beta$ are
only determined up to the space spanned by them and, for any non-singular
matrix $\xi$ conformable by product, we have:

$$ \Pi = \alpha\beta' = \alpha\xi^{-1}\xi\beta' $$

In other words, $\beta'$ and $\xi\beta'$ are two observationally equivalent
bases of the cointegrating space. The obvious implication is that, before
solving such an identification problem, no meaningful economic interpretation
of coefficients in cointegrating vectors can be proposed. The solution is
achieved by imposing a sufficient number of restrictions on parameters such
that the matrix satisfying such constraints in the cointegrating space is
unique. Such a criterion is derived in Johansen (1992) and discussed in the
work of Johansen-Juselius, Giannini ([15]) and Hamilton ([20]). Given the
matrix of cointegrating vectors $\beta$, we can formulate linear constraints
on the different cointegrating vectors using the $R_i$ matrices of dimensions
$r_i \times n$. Let us consider the columns of $\beta$, i.e. the parameters in
each cointegrating vector, ignoring the normalisation constraint to 1 of one
variable in each cointegrating vector. Any structure of linear constraints can
be represented as follows:


$$ R_i\beta_i = 0 $$

$R_i$ $(r_i \times n)$, $\beta_i$ $(n \times 1)$, rank $R_i = r_i$.

The same constraints can be expressed in explicit form as follows:

$$ \beta_i = S_i\theta_i $$

$S_i$ $(n \times (n - r_i))$, $\beta_i$ $(n \times 1)$, $\theta_i$ $((n - r_i) \times 1)$, rank $S_i = n - r_i$, $R_iS_i = 0$.

A necessary and sufficient condition for identification of parameters in the
$i$-th cointegrating vector is the following:

$$ \text{rank}(R_i\beta) = r - 1 \qquad (2.29) $$

When (2.29) is satisfied it is not possible to replicate the $i$-th
cointegrating vector by taking linear combinations of the parameters in the
other cointegrating vectors. In this case the matrix obtained by applying to
the cointegrating space the restrictions of the $i$-th cointegrating vector has
rank $r - 1$.

A necessary condition for identification is immediately derived in that
$R_i\beta$ must have enough rows to satisfy condition (2.29); therefore a
necessary condition for identification is that each cointegrating vector has at
least $r - 1$ restrictions.

A sufficient condition for identification is provided by Johansen by
considering the implicit and explicit form of expressing constraints:

Theorem 2.1 The $i$-th cointegrating vector is identified by the constraints
$S_1, S_2, \ldots, S_r$ if for each $k = 1, \ldots, r-1$ and for each set of
indices $1 \leq j_1 < \ldots < j_k \leq r$, not containing $i$, we have that:
$\text{rank}[R_iS_{j_1}, \ldots, R_iS_{j_k}] \geq k$

Given identification of the system, we can distinguish the case of
just-identification and over-identification. In case of over-identification,
the over-identifying restrictions are testable.
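
The identification problem discussed at the start of this section can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the loadings, the value of $\beta_{22}$ and the rotation matrix `xi` are hypothetical, and exact fractions are used to avoid rounding. It shows that a rotated pair of loadings and cointegrating vectors delivers exactly the same long-run matrix:

```python
from fractions import Fraction as F

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

# Hypothetical loadings with the zero pattern of the money demand example
alpha = [[F(-1, 5), F(1, 10)],
         [F(0), F(0)],
         [F(0), F(3, 10)],
         [F(0), F(0)]]
# beta' with a hypothetical value beta_22 = 2
beta_t = [[F(1), F(-1), F(0), F(0)],
          [F(0), F(0), F(1), F(-2)]]

# Hypothetical non-singular rotation and its inverse
xi = [[F(1), F(1)], [F(0), F(1)]]
xi_inv = [[F(1), F(-1)], [F(0), F(1)]]

Pi = matmul(alpha, beta_t)
alpha_star = matmul(alpha, xi_inv)      # rotated loadings
beta_star_t = matmul(xi, beta_t)        # rotated cointegrating vectors
Pi_star = matmul(alpha_star, beta_star_t)

print(Pi_star == Pi)            # same long-run matrix
print(beta_star_t == beta_t)    # different cointegrating vectors
```

The data only pin down the long-run matrix, so the two sets of cointegrating vectors cannot be told apart without restrictions of the kind described above.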

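2.7.4.1 An illustrative example Let us reconsider our example on money
demand. Considering the following vectorial representation of the series,
$(m-p, \; y, \; R^m, \; R^b)$, and leaving aside normalizations, the matrix
$\beta$ can be represented as follows:

$$ \beta = \begin{pmatrix} \beta_{11} & 0 \\ -\beta_{11} & 0 \\ 0 & \beta_{32} \\ 0 & -\beta_{42} \end{pmatrix} $$

given the following general representation of the matrix $\beta$:

$$ \beta = \begin{pmatrix} \beta_{11} & \beta_{12} \\ \beta_{21} & \beta_{22} \\ \beta_{31} & \beta_{32} \\ \beta_{41} & \beta_{42} \end{pmatrix} $$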

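our constraints imply the following specification for the matrices $R_i$ and
$S_i$:

$$ R_1 = \begin{pmatrix} 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{pmatrix}, \quad S_1 = \begin{pmatrix} 1 \\ -1 \\ 0 \\ 0 \end{pmatrix} $$

$$ R_2 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{pmatrix}, \quad S_2 = \begin{pmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix} $$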


The necessary conditions for identification are obviously satisfied, while the
sufficient conditions for identification require: $\text{rank}(R_1S_2) \geq 1$,
$\text{rank}(R_2S_1) \geq 1$. They are also satisfied, in fact:

$$ R_1S_2 = \begin{pmatrix} 0 & 0 \\ 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad R_2S_1 = \begin{pmatrix} 1 \\ -1 \end{pmatrix} $$
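
The rank conditions of this example can be verified mechanically. The following is a minimal sketch using exact arithmetic; the helper functions `matmul` and `rank` are written here for illustration and are not part of any package used in the text:

```python
from fractions import Fraction

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def rank(M):
    """Rank of a matrix by Gauss-Jordan elimination with exact fractions."""
    rows = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(rows[0])):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue                      # no pivot in this column
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
        if r == len(rows):
            break
    return r

# Restriction matrices of the money demand example
R1 = [[1, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
S1 = [[1], [-1], [0], [0]]
R2 = [[1, 0, 0, 0], [0, 1, 0, 0]]
S2 = [[0, 0], [0, 0], [1, 0], [0, 1]]

R1S2 = matmul(R1, S2)
R2S1 = matmul(R2, S1)
print(R1S2, rank(R1S2))   # sufficient condition for vector 1: rank >= 1
print(R2S1, rank(R2S1))   # sufficient condition for vector 2: rank >= 1
```

With two cointegrating vectors the theorem only requires the check for $k = 1$, which is what the two ranks above cover.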



2.7.5 Hypothesis testing with multiple cointegrating vectors 

The Johansen procedure allows for testing the validity of restricted forms of
cointegrating vectors. More precisely, the validity of restrictions in addition
to those necessary to identify the long-run equilibria (over-identifying
restrictions) can be tested. The intuition behind the construction of all
tests is that, when there are $r$ cointegrating vectors, only these $r$ linear
combinations of variables are stationary; therefore the test statistics involve
comparing the number of cointegrating vectors under the null and the
alternative hypothesis. Following this intuition, we understand immediately why
only over-identifying restrictions can be tested: just-identified models
feature the same long-run matrix $\Pi$, and therefore the same eigenvalues of
$\Pi$. Consider the case of testing restrictions on a set of $r$ identified
cointegrating vectors stacked in the matrix $\beta$. Let
$\hat{\lambda}_1, \hat{\lambda}_2, \ldots, \hat{\lambda}_r$ be the ordered
eigenvalues of the $\Pi$ matrix in the unrestricted model, and
$\lambda^*_1, \lambda^*_2, \ldots, \lambda^*_r$ the ordered eigenvalues of the
$\Pi$ matrix in the restricted model. Restrictions on $\beta$ are testable by
forming the following test statistic:

$$ T \sum_{i=1}^{r} \left[ \ln(1 - \lambda^*_i) - \ln(1 - \hat{\lambda}_i) \right] \qquad (2.30) $$

Johansen ([32]) shows that the statistic (2.30) takes a $\chi^2$ distribution
with degrees of freedom equal to the number of over-identifying restrictions.
Note that small values of $\lambda^*_i$ with respect to $\hat{\lambda}_i$ imply
a reduction of rank of $\Pi$ when the restrictions are imposed and hence the
rejection of the null hypothesis. This testing procedure can be extended to
tests on restrictions on the matrix of weights $\alpha$ or on the deterministic
components (constant and trends) of the cointegrating vectors.
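
A minimal sketch of the computation of statistic (2.30) and of its p-value follows. The eigenvalues and the sample size are hypothetical; the function names are introduced only for this illustration, and the survival function of the $\chi^2$ distribution is coded through its closed form, which is valid for even degrees of freedom only:

```python
import math

def lr_statistic(unrestricted, restricted, T):
    """Statistic (2.30): T * sum[ln(1 - restricted_i) - ln(1 - unrestricted_i)]."""
    return T * sum(math.log(1.0 - r) - math.log(1.0 - u)
                   for u, r in zip(unrestricted, restricted))

def chi2_sf_even_df(x, df):
    """P(chi2_df > x) for even df: exp(-x/2) * sum_{j < df/2} (x/2)^j / j!."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form used here requires even degrees of freedom")
    half = x / 2.0
    term, total = 1.0, 0.0
    for j in range(df // 2):
        if j > 0:
            term *= half / j
        total += term
    return math.exp(-half) * total

# Hypothetical eigenvalues (restricted ones are never larger) and sample size
stat = lr_statistic(unrestricted=[0.20, 0.10], restricted=[0.19, 0.09], T=200)
p_value = chi2_sf_even_df(stat, df=2)
print(f"LR = {stat:.2f}, p-value = {p_value:.3f}")
```

The same function can be used to check the p-values attached to the likelihood ratio tests reported in the empirical application of the next section, where the degrees of freedom are 2 and 4.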

2.7.6 Cointegration and Common Stochastic Trends 

Having discussed the VECM representation for a vector of $m$ non-stationary
variables admitting $k$ cointegrating relationships, it is interesting to
compare it with the multivariate extension of the Beveridge-Nelson
decomposition. Consider the simple case of an I(1) vector $y_t$ featuring
first order dynamics and no deterministic components:

$$ \Delta y_t = \alpha\beta' y_{t-1} + u_t \qquad (2.31) $$

where $\alpha$ is the $(m \times k)$ matrix of loadings and $\beta$ is the
$(m \times k)$ matrix of parameters in the cointegrating relationships. As
$y_t$ is I(1), we can apply the Wold decomposition theorem to $\Delta y_t$ to
obtain the following representation:

$$ \Delta y_t = C(L) u_t $$

From which, by applying the algebra illustrated in our discussion of the
univariate Beveridge-Nelson decomposition, we can derive the following
stochastic trends representation:

$$ y_t = C^*(L) u_t + C(1) z_t $$

where $z_t$ is a process for which we have $\Delta z_t = u_t$. The existence of
cointegration imposes some restrictions on the $C$ matrices: the stochastic
trends must cancel out when the $k$ stationary linear combinations of the
variables in $y_t$ are considered. In other words, we must have:

$$ \beta' C(1) = 0 $$

By investigating further the relation between the VECM and the stochastic
trends representations we can give a more precise parameterisation of the
matrix $C(1)$.
Note first that equation (2.31) can be re-written as:

$$ y_t = (I_m + \alpha\beta') y_{t-1} + u_t $$

Premultiplying this system by $\beta'$ yields:

$$ \beta' y_t = \beta' (I_m + \alpha\beta') y_{t-1} + \beta' u_t = (I_k + \beta'\alpha) \beta' y_{t-1} + \beta' u_t \qquad (2.32) $$

Solving this model recursively, we obtain the MA representation for the $k$
cointegrating relationships:

$$ \beta' y_t = \sum_{i=0}^{\infty} (I_k + \beta'\alpha)^i \beta' u_{t-i} \qquad (2.33) $$

By substituting (2.33) in (2.31) we have the MA representation for
$\Delta y_t$:

$$ \Delta y_t = \alpha \sum_{i=1}^{\infty} (I_k + \beta'\alpha)^{i-1} \beta' u_{t-i} + u_t $$

from which we have:

$$ C(1) = I_m - \alpha (\beta'\alpha)^{-1} \beta' \qquad (2.34) $$

Now note the beautiful³ relation

$$ I_m = \beta_\perp (\alpha_\perp' \beta_\perp)^{-1} \alpha_\perp' + \alpha (\beta'\alpha)^{-1} \beta' \qquad (2.35) $$

where $\beta_\perp$, $\alpha_\perp$ are $(m \times (m-k))$ matrices of rank
$m - k$ such that $\alpha_\perp'\alpha = 0$, $\beta_\perp'\beta = 0$.

By using (2.35) in (2.34), we have

$$ C(1) = \beta_\perp (\alpha_\perp' \beta_\perp)^{-1} \alpha_\perp' $$

and

$$ y_t = C^*(L) u_t + \beta_\perp (\alpha_\perp' \beta_\perp)^{-1} (\alpha_\perp' z_t) $$

which shows that a system of $m$ variables with $k$ cointegrating relationships
features $(m - k)$ linearly independent common trends (TR). The common trends
are given by $(\alpha_\perp' z_t)$, while the coefficients on these trends are
$\beta_\perp (\alpha_\perp' \beta_\perp)^{-1}$. Note also that stochastic
trends depend on a set of initial conditions and on cumulated disturbances; in
fact

$$ TR_t = TR_{t-1} + C(1) u_t $$

³ See Johansen ([34]), page 40.

Our brief discussion should have made clear that the VECM model and the MA
model are complementary. As a consequence, the identification problem relevant
for the vector of parameters in the cointegrating vectors $\beta$ is also
relevant for the vector of parameters determining the stochastic trends
$\alpha_\perp$. However, there is one aspect in which the two concepts are
different. In theory, identified cointegrating relationships on a given set of
variables should be robust to augmentation of the information set by adding new
variables, which should have a zero coefficient in the cointegrating vectors of
the VECM representation of the larger information set. This is not true for the
stochastic trends. Consider the case of augmenting an information set
consisting of $m$ variables admitting $k$ cointegrating vectors to $m + n$
variables: the number of cointegrating vectors is constant while the number of
stochastic trends increases by $n$. Moreover, an unanticipated shock in a small
system need not be unanticipated in a larger system. Note that we have added
"in theory" to our statement. This is because, in practice, given the size of
available samples, application of the procedure to analyze cointegration in a
larger set of variables might lead us to identify different cointegrating
relationships from those obtained on a smaller set of variables.
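
The algebra linking the VECM and the common trends representation can be checked on a small case. The sketch below assumes a hypothetical bivariate system with one cointegrating relationship ($m = 2$, $k = 1$); the numerical values of the loadings, of the cointegrating vector and of their orthogonal complements are chosen only for illustration:

```python
from fractions import Fraction as F

def dot(a, b):
    """Inner product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def outer(a, b):
    """Outer product a b' of two vectors."""
    return [[x * y for y in b] for x in a]

# Hypothetical system with m = 2 variables and k = 1 cointegrating vector
alpha = [F(-1), F(0)]        # loadings
beta = [F(1), F(-1)]         # cointegrating vector
alpha_perp = [F(0), F(1)]    # alpha_perp' alpha = 0
beta_perp = [F(1), F(1)]     # beta_perp' beta = 0

identity = [[F(1), F(0)], [F(0), F(1)]]
ab = outer(alpha, beta)
bp_ap = outer(beta_perp, alpha_perp)

# C(1) = I - alpha (beta' alpha)^{-1} beta', as in (2.34)
C1 = [[identity[i][j] - ab[i][j] / dot(beta, alpha) for j in range(2)]
      for i in range(2)]
# C(1) = beta_perp (alpha_perp' beta_perp)^{-1} alpha_perp', via (2.35)
C1_alt = [[bp_ap[i][j] / dot(alpha_perp, beta_perp) for j in range(2)]
          for i in range(2)]
# beta' C(1): the stochastic trend cancels in the cointegrating relationship
beta_C1 = [dot(beta, [C1[0][j], C1[1][j]]) for j in range(2)]

print(C1 == C1_alt, beta_C1)
```

With these values only the second shock has a permanent effect, and it enters both variables with the same weight, so the single common trend disappears from the cointegrating combination.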

2.7.7 VECM and common trends representations 

The joint behaviour of consumption and income under the Permanent Income
Hypothesis (PIH) is a good empirical example to illustrate VECM and common
trends representations. Let $y_t$, $y^p_t$ and $c_t$ denote respectively the
logarithms of aggregate disposable income, permanent income and consumption.
Under PIH the joint distribution of consumption and income can be characterised
as follows:

$$ y_t = y^p_t + v_t $$
$$ y^p_t = \mu + y^p_{t-1} + u_t $$
$$ c_t = y^p_t $$

Permanent income is the stochastic trend of income, which is made of a
permanent component and of a transitory component; $v_t$ and $u_t$ are the
shocks to the transitory and the permanent component of income, and it is
natural to think of them as orthogonal shocks, normally and independently
distributed. Consumption and income are cointegrated; in fact, they share the
single unobservable common stochastic trend in this system.

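By eliminating the unobservable stochastic trend from the system, we have
a bi-variate structural representation:

$$ y_t = c_t + v_t $$
$$ c_t = \mu + c_{t-1} + u_t \qquad (2.36) $$

We obtain the VAR(1) representation by substituting for $c_t$ in the first
equation from the second equation of (2.36):

$$ \begin{pmatrix} y_t \\ c_t \end{pmatrix} = \begin{pmatrix} \mu \\ \mu \end{pmatrix} + \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_{t-1} \\ c_{t-1} \end{pmatrix} + \begin{pmatrix} w_t \\ u_t \end{pmatrix}, \qquad w_t = u_t + v_t $$

From which we immediately obtain the VECM representation

$$ \begin{pmatrix} \Delta y_t \\ \Delta c_t \end{pmatrix} = \begin{pmatrix} \mu \\ \mu \end{pmatrix} + \alpha\beta' \begin{pmatrix} y_{t-1} \\ c_{t-1} \end{pmatrix} + \begin{pmatrix} w_t \\ u_t \end{pmatrix} $$

where

$$ \alpha = \begin{pmatrix} -1 \\ 0 \end{pmatrix}, \qquad \beta' = \begin{pmatrix} 1 & -1 \end{pmatrix} $$

The common trends representation is derived by considering that, as
$y_t - c_t = v_t$, the MA representation for consumption and income growth is
then

$$ \begin{pmatrix} \Delta y_t \\ \Delta c_t \end{pmatrix} = \begin{pmatrix} \mu \\ \mu \end{pmatrix} + \begin{pmatrix} w_t \\ u_t \end{pmatrix} + \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} w_{t-1} \\ u_{t-1} \end{pmatrix} $$

from which we have

$$ C(1) = \begin{pmatrix} 0 & 1 \\ 0 & 1 \end{pmatrix} $$

and

$$ \alpha_\perp = \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \qquad \beta_\perp = \begin{pmatrix} 1 \\ 1 \end{pmatrix} $$

Given that in this application $\alpha_\perp'\beta_\perp = 1$, it follows that
consumption and income have a single common stochastic trend. Such trend can be
represented as $\mu t + \sum_{i=1}^{t} u_i$, and only shocks to the permanent
component of income enter the trend.

2.8 Multivariate cointegration: an application to US data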

To illustrate empirically how cointegration analysis is performed, let us
consider monthly data from the US economy for the variables considered in basic
macroeconomic models: the log of real M2 $(m - p)$; annual, seasonally
adjusted CPI inflation $(\pi)$; the log of monthly real GDP $(y)$; the nominal
own return on M2 $(R^m)$; and the nominal opportunity cost of holding money, as
measured by the interest rate on three-month Treasury Bills $(R^b)$. All series
except $R^m$ are those used in Leeper-Sims-Zha ([38]); $R^m$ has been retrieved
from the St. Louis Fed website at http://www.stls.frb.org/fred/. They are
available in the file LSZUSA.XLS. We shall perform cointegration analysis using
the package PC-FIML by Doornik and Hendry ([10]). Alternative menu-driven
packages are available in RATS (see [41], [21]); E-Views does not allow us to
perform all the steps of the analysis, in that specification and testing of the
long-run restrictions are not (yet) available.

2.8.1 Specification of the VAR 

The first step of the empirical analysis is the specification of the VAR, which
requires the consideration of two issues: the set of variables included in the
VAR and the lag length of the VAR. These are important issues in that
mis-specification of the VAR leads to misleading inference. In general the set
of variables to be included in the VAR is determined by the economic problem at
hand; however, this criterion does not rule out the possibility of
mis-specification. Consider the set of variables chosen for our example: they
include all the variables used in a simple IS-LM model of a closed economy, but
nothing guarantees that the US economy is correctly described by such a model.
Suppose that the central bank targets expected inflation by using short-term
interest rates as an instrument. The model is mis-specified if it omits any
leading indicator for inflation monitored by the central bank. An obvious
candidate is the commodity price index, but there might be more, such as
long-term interest rates or other asset prices. In the absence of an obvious
baseline model, the behaviour of residuals is taken as an indicator of
mis-specification. In a correctly specified model residuals should be random
normal variables with zero mean and constant variance-covariance matrix;
departure of fitted residuals from those hypotheses could be taken as an
indicator of mis-specification. However, even when all the relevant variables
have been included, the model could still be mis-specified because of omitted
relevant dynamics. The selection of the order of the VAR is an important step
in the specification. Sims ([52]) suggests a statistic to test the validity of
restrictions imposed on a general model:


$$ (T - k) \left[ \log |\Sigma_r| - \log |\Sigma_u| \right] $$

where $T$ is the sample size, $k$ is the number of parameters estimated in each
equation of the VAR, $|\Sigma_r|$ is the determinant of the variance-covariance
matrix of the residuals of the fitted restricted model, and $|\Sigma_u|$ is the
determinant of the variance-covariance matrix of the residuals of the fitted
general unrestricted model. The statistic has a $\chi^2$ distribution with
degrees of freedom equal to the number of restrictions in the system. The term
$(T - k)$ includes a small sample correction; in fact, as $T$ becomes larger
the correction for small sample $(T - k)/T$ converges towards unity. Obviously,
the selection of variables and the selection of the lag length are not
independent processes: a longer lag length might be the consequence of the
omission of one relevant variable from the VAR. In practice we shall start from
a baseline VAR including the set of variables suggested by the theory and a
sufficiently long lag, and check the behaviour of residuals. When well-behaved
residuals are obtained, we proceed to reduction of the lag length by testing
the validity of the implied restrictions.
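
As a minimal sketch of the lag-reduction test, the statistic can be computed from the two determinants once the restricted and unrestricted systems have been estimated. All numbers below are hypothetical, as are the function names; in particular it is assumed here that $k$ counts the parameters in each equation of the unrestricted VAR:

```python
import math

def sims_lr_statistic(det_restricted, det_unrestricted, T, k):
    """(T - k) * [log|Sigma_r| - log|Sigma_u|], with the small sample correction."""
    return (T - k) * (math.log(det_restricted) - math.log(det_unrestricted))

def n_lag_restrictions(n_vars, lags_general, lags_restricted):
    """Each dropped lag removes one n_vars x n_vars coefficient matrix."""
    return n_vars * n_vars * (lags_general - lags_restricted)

# Hypothetical inputs: five variables, reduction from 15 to 6 lags
T = 219                      # hypothetical effective sample size
k = 5 * 15 + 2               # assumed: lags of five variables, constant and trend
stat = sims_lr_statistic(det_restricted=2.0e-18, det_unrestricted=1.0e-18, T=T, k=k)
dof = n_lag_restrictions(n_vars=5, lags_general=15, lags_restricted=6)
print(f"statistic = {stat:.1f}, degrees of freedom = {dof}")
```

The statistic would then be compared with the critical values of a $\chi^2$ distribution with the indicated degrees of freedom.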

Our general baseline model is a VAR estimated over the sample 1960:1-1979:6,
including fifteen lags of each of the five variables, a constant and a trend,
so we have:

$$ \begin{pmatrix} y_t \\ \pi_t \\ (m-p)_t \\ R^m_t \\ R^b_t \end{pmatrix} = a_0 + a_1 t + \sum_{i=1}^{14} A_i \begin{pmatrix} y_{t-i} \\ \pi_{t-i} \\ (m-p)_{t-i} \\ R^m_{t-i} \\ R^b_{t-i} \end{pmatrix} + u_t $$

We have chosen to end our estimation in 1979 because, from the second part
of 1979 to 1982, the Fed changed its operating procedure, moving from an
interest rate targeting regime to a reserves targeting regime. As a
consequence, parameters in the Fed's reaction function must have changed. It is
very important to estimate cointegrating models using data from a single
regime. In fact, structural instability might be very dangerous when
cointegrated models are used. The intuition is very simple: in the presence of
parameter instability a cointegrated model is very likely to push the system
towards the "wrong" long-term equilibria, with very serious consequences for
forecasting and policy simulation. Checking residual behaviour is very
important also in this respect, as pathological behaviour of residuals is a
clear symptom of parameter instability.

The estimation of our base-line model delivers the set of residuals reported 
in Figure 2.12. 

The residuals are normalized, hence residuals with absolute value higher than
1.96 occur with a probability of five percent under the null of normality. We
note many outliers; in fact, when a formal test of normality of residuals is
performed, the null is strongly rejected⁴. This is worrying in that
non-normality might signal mis-specification, but also in that departure from
normality might induce misleading inference in the application of the Johansen
procedure.

⁴ We shall discuss tests for normality at a later stage of the book.

Fig. 2.12. VAR residuals with outliers

Interestingly, most outliers occur on occasion of the oil price crises. So,
probably, a commodity price index is the relevant omitted variable causing
non-normality in the residuals. However, the inclusion of a commodity price
index as a further endogenous variable in our system would simply shift the
outlier problem from our equation for interest rates to the commodity price
index. In fact no variable included in this system has a high explanatory
potential for a commodity price index. We have then included in the system
contemporaneous and lagged (up to the sixth lag) commodity price inflation.
We consider commodity price inflation as a stationary exogenous variable; this
choice shall be discussed later on. We have also included dummies for
exceptional periods during the oil price crises as exogenous variables in our
system. In general, dumYYMM is a variable taking value 1 in month MM of year
YY and zero elsewhere. We include the following dummy variables: dum7306,
dum7307, dum7308, dum7310, dum7311, dum7312, dum7402, dum7403, dum7407,
dum7408, dum7409, dum7501, dum7505, dum7806, dum7808, dum7811, dum7904. Note
that the inclusion of dummies and exogenous variables in the specification
modifies the deterministic nucleus of our model, and appropriate critical
values for the tests should be re-computed (see ??, ??). We do not take this
step and report the critical values automatically indicated by version 9 of
Pc-Fiml in Table 6.
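
The construction of the impulse dummies listed above can be sketched as follows. This is only an illustration: the function names are hypothetical, and it is assumed that the four digits in each name give the last two digits of the year followed by the month, with all years in the 1900s:

```python
def month_index(start_year, start_month, end_year, end_month):
    """List of (year, month) pairs covering the estimation sample."""
    periods = []
    year, month = start_year, start_month
    while (year, month) <= (end_year, end_month):
        periods.append((year, month))
        month += 1
        if month == 13:
            year, month = year + 1, 1
    return periods

def impulse_dummy(name, periods):
    """Series equal to 1 in the month coded in the name and 0 elsewhere."""
    yy, mm = int(name[3:5]), int(name[5:7])   # e.g. "dum7306" -> 73, 06
    target = (1900 + yy, mm)                  # assumption: years in the 1900s
    return [1 if p == target else 0 for p in periods]

periods = month_index(1960, 1, 1979, 6)       # sample 1960:1-1979:6
names = ["dum7306", "dum7307", "dum7308", "dum7310", "dum7311", "dum7312",
         "dum7402", "dum7403", "dum7407", "dum7408", "dum7409", "dum7501",
         "dum7505", "dum7806", "dum7808", "dum7811", "dum7904"]
dummies = {name: impulse_dummy(name, periods) for name in names}
print(len(periods), len(dummies))
```

Each series contains a single 1, so each dummy removes the influence of one observation on the estimated residuals.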

The inclusion of the dummies in the system delivers a new set of residuals, 
reported in Figure 2.13, which are virtually free from outliers and do not show 
any departure from the hypothesis of normality. 



Fig. 2.13. VAR residuals without outliers

We proceed then to assess the possibility of simplification of the system. The 
progressive simplification strategy, based on the likelihood ratio tests discussed 
above, leads us to a specification with 6 lags. 

2.8.2 Selection of the deterministic components in the VECM specification 

The choice of the deterministic components in the VAR is not trivial, given
that it affects the distribution of the relevant statistics to perform
cointegration analysis. Given the following general VECM model:

$$ \Delta y_t = \mu_0 + \mu_1 t + \Pi_1 \Delta y_{t-1} + \Pi_2 \Delta y_{t-2} + \ldots + \Pi y_{t-n} + u_t $$

five possible specifications for the deterministic components have been
considered in the literature:

(i) $\mu_0 = 0$, $\mu_1 = 0$: this would determine a zero mean in the I(0)
components and a non-zero mean in the I(1) components

(ii) $\mu_0 = \alpha\beta_0$, $\mu_1 = 0$: this would restrict the constant to
belong to the cointegrating space, inducing a non-zero mean both in the I(0)
and the I(1) components

(iii) $\mu_0$ unrestricted, $\mu_1 = 0$: this would generate a zero mean in the
I(0) components and a linear trend in the I(1) components

(iv) $\mu_0$ unrestricted, $\mu_1 = \alpha\beta_1$: this would generate a
linear trend both in the I(0) and the I(1) components

(v) $\mu_0$ unrestricted, $\mu_1$ unrestricted: this would generate a linear
trend in the I(0) components and a quadratic trend in the I(1) components

Different critical values have been tabulated for each specification ([?]) and
are now automatically available in all packages performing the Johansen
procedure. Note that the inclusion of intervention dummies also modifies the
deterministic components of the VAR, and this requires in principle an ad hoc
tabulation of the relevant critical values ([37]).

In our application we choose specification (iv), as some of our series show
trends in levels and, as already stated, we ignore the modification of the
relevant distribution generated by the inclusion of dummies.

2.8.3 Test for the rank of $\Pi$

Having specified the VAR and chosen the specification of the deterministic
component, we can estimate the $\Pi$ matrix and start our analysis of the
long-run properties of the system. We apply the Johansen procedure to identify
the rank of the matrix $\Pi$ in the VECM re-parameterisation of our model.

The results of the Johansen procedure are reported in Table 6:

TABLE 6: Analysis of the $\Pi$ matrix in the estimated VAR model

Eigenvalue   H0: rank = p   Eigenval   C. Eigenval   95%    Trace   C. Trace   95%
0.185        p = 0          45.74      39.61         37.5   134.1   116.1      87.3
0.148        p <= 1         35.93      31.11         31.5   88.36   76.53      63
0.133        p <= 2         32.14      27.84         25.5   52.44   45.41      42.4
0.083        p <= 3         19.56      16.94         19     20.3    17.58      25.3
0.0033       p <= 4         0.74       0.64          12.3   0.74    0.64       12.3

where Eigenval is the max eigenvalue test, C. Eigenval is the max eigenvalue
test corrected for small sample, i.e. using $T - k$ instead of $T$, Trace is
the trace test, C. Trace is the trace test corrected for small sample, and 95%
are the critical values tabulated for our specification of the deterministic
components. Table 6 poses an interesting problem to the applied researcher in
that the trace statistic and the maximum eigenvalue statistic deliver different
results, with more relevant differences when the small sample correction is
adopted. We opt for rejecting the null of at most one cointegrating vector and
do not reject the null of at most two. Of course, such a choice is debatable.
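
To see how the different statistics can point in different directions, the sequential testing strategy can be applied mechanically to the figures of Table 6. The sketch below is only an illustration: the function name is hypothetical, the rule simply stops at the first null that is not rejected at the tabulated 95% critical value, and it ignores the caveat, noted in the text, that these critical values do not account for the dummies:

```python
def select_rank(statistics, critical_values):
    """Return the first k for which the null of at most k vectors is not rejected."""
    for k, (stat, cv) in enumerate(zip(statistics, critical_values)):
        if stat <= cv:
            return k
    return len(statistics)   # every null rejected

# Figures taken from Table 6
max_eig = [45.74, 35.93, 32.14, 19.56, 0.74]
max_eig_corrected = [39.61, 31.11, 27.84, 16.94, 0.64]
cv_max_eig = [37.5, 31.5, 25.5, 19.0, 12.3]
trace = [134.1, 88.36, 52.44, 20.3, 0.74]
trace_corrected = [116.1, 76.53, 45.41, 17.58, 0.64]
cv_trace = [87.3, 63.0, 42.4, 25.3, 12.3]

ranks = {
    "max eigenvalue": select_rank(max_eig, cv_max_eig),
    "max eigenvalue (corrected)": select_rank(max_eig_corrected, cv_max_eig),
    "trace": select_rank(trace, cv_trace),
    "trace (corrected)": select_rank(trace_corrected, cv_trace),
}
for name, rank in ranks.items():
    print(f"{name}: {rank}")
```

A mechanical reading of this kind does not deliver a single answer, which is why the final choice of the number of cointegrating vectors also rests on judgement.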

Note that, before any identifying restrictions are introduced, most available
cointegration packages do deliver some point estimates of the $\alpha$ and
$\beta$ matrices as follows:

TABLE 7: Cointegrating vectors: the Johansen interpretation.

Standardised $\beta'$

y        pi       m - p    R^m      R^b      Trend
1        0.078    -0.40    3.55     -3.96    -0.18
1.08     1        -0.61    -1.20    -1.08    -0.15

Standardised $\alpha$

y        -0.02       -0.013
pi       0.02        -0.005
m - p    0.047       -0.018
R^m      -0.00008    -0.002
R^b      0.03        0.01

These estimates are obtained by imposing a default identification which
delivers cointegrating vectors orthogonal to each other ([36]). In some
contexts, for example a demand and supply system, this assumption might be the
economic case of interest. However, this is not the case in general, nor in our
specific example. In the next section we shall evaluate the potential of
different identifications of economic interest by checking the validity of
over-identifying restrictions.

2.8.4 Specification and testing of the long-run restrictions

We consider two different proposals. In the first one we identify a traditional
money demand and a relation between the own interest rate on money and the
opportunity cost of holding money. As an alternative, we identify an interest
rate reaction function in which the nominal interest rate responds to
inflation, output and a linear trend, along with a relation between interest
rates and inflation. We have selected these two specifications because, as we
shall see, they form the basis for two alternative targeting strategies:
inflation targeting via the control of money growth and inflation targeting via
the control of interest rates.

We can parameterise the restrictions implied by the first identification scheme
as follows, with the variables ordered as $(y, \pi, m-p, R^m, R^b, \text{Trend})$:

$$ \beta' = \begin{pmatrix} \beta_{11} & 0 & 1 & \beta_{41} & \beta_{51} & \beta_{61} \\ 0 & \beta_{22} & 0 & \beta_{42} & 1 & 0 \end{pmatrix} $$

The results, reported in Table 8, show that the two over-identifying
restrictions are not rejected. The first cointegrating vector is consistent
with a money demand function as far as the semi-elasticities with respect to
interest rates are concerned; the elasticity with respect to income is somewhat
high, although it is compensated by a deterministic trend with the opposite
sign. However, looking at the weights on the cointegrating vectors, we note
that real money reacts very little to disequilibrium in the first cointegrating
relationship. In fact the only strongly significant weight is the one
describing the reaction of real income to disequilibrium in the second
cointegrating relationship.

TABLE 8: A scheme of overidentified cointegrating vectors

Standardised $\beta'$ (standard errors in parentheses)

y        pi       m - p    R^m      R^b      Trend
-2.20    0        1        -7.29    7.51     0.38
(0.17)                     (2.16)   (0.96)   (0.06)
0        1.08     0        -3.16    1        0
         (0.22)            (0.59)

Standardised $\alpha$ (standard errors in parentheses)

y        0.064       -0.17
         (0.015)     (0.036)
pi       -0.0016     -0.034
         (0.009)     (0.021)
m - p    -0.019      -0.014
         (0.009)     (0.023)
R^m      -0.0006     0.002
         (0.0001)    (0.003)
R^b      -0.023      0.03
         (0.008)     (0.02)

LR-test, rank = 2: $\chi^2(2)$ = 1.03 [0.59]



We then consider the second alternative and parameterise restrictions as
follows, with the variables ordered as $(y, \pi, m-p, R^m, R^b, \text{Trend})$:

$$ \beta' = \begin{pmatrix} \beta_{11} & -1 & 0 & 0 & 1 & \beta_{61} \\ 0 & \beta_{22} & 0 & \beta_{42} & 1 & 0 \end{pmatrix} $$

The results, reported in Table 9, show the plausibility of the interpretation
of the first cointegrating relation as a reaction function for the monetary
policy maker. Policy rates react to inflation, with a coefficient which can be
restricted to one, and to deviation of output from a deterministic trend (a
non-stationary variable in our specification). The estimated weights strongly
support the identification of this relationship as an equilibrium for the
policy rates. The second cointegrating vector does not differ significantly
from the one obtained within the first identification scheme.

TABLE 9: A scheme of overidentified cointegrating vectors

Standardised $\beta'$ (standard errors in parentheses)

y        pi       m - p    R^m      R^b      Trend
-0.22    -1       0        0        1        0.08
(0.03)                              (0.68)   (0.009)
0        -0.96    0        2.75     1        0
         (0.17)            (0.47)

Standardised $\alpha$ (standard errors in parentheses)

y        0.13213     -0.002
         (0.069)     (0.039)
pi       0.047       -0.017
         (0.038)     (0.022)
m - p    -0.09       -0.09
         (0.042)     (0.034)
R^m      0.007       0.003
         (0.005)     (0.0029)
R^b      -0.13       -0.062
         (0.036)     (0.021)

LR-test, rank = 2: $\chi^2(4)$ = 6.1 [0.19]



We conclude our analysis of these two alternative identification schemes by 
stressing that statistical criteria do not lead to an unequivocal identification, 
then the choice between the two alternatives is very likely to rely on economic 
criteria. 

2.9 Multivariate decompositions: some considerations 

The purpose of our illustration of the Johansen procedure in the previous section 
was to show that the identification of cointegrating vectors requires involves a 
multi-step process. The outcome of many of these steps is not so clear-cut and 
therefore the final product might be differ across researchers. The presence of 
structural breaks paired with the specification issues and size of available sam¬ 
ples have an important impact on the empirical application of the procedure. 
Alternative methods to the this procedure have been proposed in the literature, 
see, for example, Horvath and Watson([28]), Phillips ([45]) , Reimers([47]) and 
Saikkonen([49]). However, it is important to remember that the specification of 
a dynamic model in levels has proved sufficient to remove the spurious regres¬ 
sion problem and that the VECM representation of a VAR model in level is 
just a reparameterisation, before the rank reduction restrictions are imposed. 
Sims,Stock and Watson([53]) , argue that a VAR model in levels in the pres¬ 
ence of cointegration is over-parameterised and therefore leads to inefficient but 
consistent estimates of the parameters of interest. The loss of efficiency has to 
be weighted against the risk of inconsistency of estimates which occurs when 
the ’’wrong” cointegrating restrictions are imposed. Imposing the ’’wrong” coin¬ 
tegrating parameters will make the system converge to the ’’wrong” long-run 
equilibria but it will also bias the short-run dynamics as the system is pulled in 
the wrong direction. For this reason the recent research has taken a defined line 
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and VAR in levels rather than co-integrated VARs are used when the issue of 
economic interest is not related to the short-run rather than to the long-run. A 
standard example is the analysis of the monetary transmission mechanism. Of 
course there is more in Sims,Stock and Watson([53]) than these considerations. 
In fact they show that standard distribution can be applied when doing inference 
in a VAR model which involves variables admitting stationary linear combina¬ 
tions, reverting to non-standard distributions is necessary only when the subsets 
of variables on which inference is performed do not admit any stationary linear 
combination. 
As a matter of fact cointegrated VARs are mainly data-driven specifications.
The macro-model for the relevant DGP is not fully specified, as is clearly
the case with the example discussed in this chapter, where we started off our
investigation with a model centered on money and ended up specifying a long-run
structure in which the quantity of money, being fully demand determined, plays
no role in the monetary transmission mechanism. It is not easy to interpret the
results from a simultaneous model when we have (loose) theories generating only
a subset of the equations. Moreover, there are difficulties with an approach
aimed at discriminating between theories on the basis of the outcome of test
statistics based on a number of joint hypotheses, some of which are clearly
independent of the theories tested. There is also an issue with the critical
values for the testing procedures in the Johansen framework. First, they depend
crucially on the specification of the deterministic nucleus of the VAR, so the
inclusion of dummies for outliers modifies the relevant critical values. A
solution to this problem is available; see Johansen and Nielsen ([37]). Second,
recent work by Johansen ([35]) has shown that it is important to implement
small-sample corrections for the asymptotic critical values, when applicable.
Taking these two aspects together, it is likely that a re-assessment of all the
empirical evidence proposed in the nineties without implementing the appropriate
corrections is necessary. So what do we make of all the verdicts passed on
theories using the wrong critical values?

Note also that cointegration analysis is based on a multi-step framework:
specification of the VAR and its deterministic component, identification of the
number of cointegrating vectors, identification of the parameters in the
cointegrating vectors, tests on the speed of adjustment with respect to
disequilibria. The results of the final test depend on the outcome of the
previous stages in the empirical analysis, but the outcome of each step is not
so easily and uniquely established empirically.

Of course there is something to be said for a methodology aimed at exploiting
cointegration to deliver a stationary representation of a non-stationary vector
autoregressive process in which short-run and long-run dynamics are naturally
separated and sound statistical inference can be applied. However, the
practical implementation of such a methodology requires the researcher to deal
with specification and identification problems which are not easily, and above
all not uniquely, solved.

3


THE IDENTIFICATION PROBLEM IN 
MACROECONOMETRICS 


3.1 Introduction 

VAR in levels and VECM representations specify the probabilistic structure of
the data. Consider the case of an empirical investigation of the monetary
transmission mechanism geared to evaluating the impact of monetary policy on
macroeconomic variables, and partition the vector of n variables of interest
into two subsets: Y, which represents the vector of macroeconomic variables of
interest, and M, the vector of monetary variables determined by the interaction
between the monetary policy maker and the economy. As we have seen that the
VECM is obtained by imposing restrictions on the VAR, consider the general
unrestricted system:

\[
\begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= D_1(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix},
\qquad
\begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix} \mid I_{t-1} \sim (0, \Sigma).
\tag{3.1}
\]



This system specifies the statistical distribution for the vector of variables
of interest conditional upon the information set available at $t-1$. In the case
of a VECM specification, after the solution of the identification problems of
cointegrating vectors, the information set available at $t-1$ contains n lagged
endogenous variables and r cointegrating vectors. We face an identification
problem in that there is more than one structure of economic interest which can
give rise to the same statistical model for our vector of variables.

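In fact for any given structure

\[
A \begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= C_1(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ B \begin{pmatrix} \nu^Y_t \\ \nu^M_t \end{pmatrix},
\tag{3.2}
\]

\[
\begin{pmatrix} \nu^Y_t \\ \nu^M_t \end{pmatrix} \mid I_{t-1} \sim (0, I),
\tag{3.3}
\]

which gives rise to the observed reduced form (3.1) when the following
restrictions are satisfied:

\[
A^{-1} C_1(L) = D_1(L), \qquad
A \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix}
= B \begin{pmatrix} \nu^Y_t \\ \nu^M_t \end{pmatrix},
\]

there exists a whole class of structures which give rise to the same statistical
model (3.1) under the same class of restrictions:

\[
F A \begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= F C_1(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ F B \begin{pmatrix} \nu^Y_t \\ \nu^M_t \end{pmatrix},
\tag{3.4}
\]

where $F$ is an admissible matrix, in the sense that it is conformable by product
with $A$, $C_1(L)$ and $B$, and $FA$, $FC_1(L)$ and $FB$ feature the same
restrictions as $A$, $C_1(L)$ and $B$.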
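
The observational equivalence of the structures in this class can be illustrated
numerically. The following is a minimal sketch under simplifying assumptions that
are not in the text: a two-variable system, a first-order lag matrix C in place
of the polynomial, and hypothetical values for A, C, B and F. No restrictions
are imposed, so every non-singular F is admissible, and the structure and its
transformation imply exactly the same reduced form.

```python
from fractions import Fraction as Fr

def matmul(X, Y):
    """Matrix product of two nested-list matrices."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def inv2(X):
    """Inverse of a 2x2 matrix."""
    (a, b), (c, d) = X
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(X):
    return [list(col) for col in zip(*X)]

def reduced_form(A, C, B):
    """Return (D1, Sigma) with D1 = A^{-1} C and Sigma = A^{-1} B B' A^{-1}'."""
    A_inv = inv2(A)
    D1 = matmul(A_inv, C)
    AB = matmul(A_inv, B)
    return D1, matmul(AB, transpose(AB))

# Hypothetical structure (illustrative values only).
A = [[Fr(1), Fr(1, 2)], [Fr(-1, 3), Fr(1)]]
C = [[Fr(1, 2), Fr(0)], [Fr(1, 4), Fr(1, 3)]]
B = [[Fr(1), Fr(0)], [Fr(0), Fr(1)]]
F = [[Fr(1), Fr(2)], [Fr(1), Fr(1)]]      # any non-singular matrix

original = reduced_form(A, C, B)
transformed = reduced_form(matmul(F, A), matmul(F, C), matmul(F, B))
equivalent = original == transformed      # same D1 and same Sigma
```

Since the data only reveal the reduced-form coefficients and covariance matrix,
they cannot discriminate between (A, C, B) and (FA, FC, FB) unless a priori
restrictions rule out every F other than the identity.
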

3.1.1 Identifiability 

A model is identifiable if all its possible structures are identifiable, i.e. if
each structure is associated with a different distribution; this happens when
the only admissible F matrix is the identity.

Let us show the point by considering identification of the first equation. In
order to achieve identification some restrictions must be imposed, as the number
of parameters in the reduced form (3.1) is smaller than the number of
parameters in the structure (3.2). For the sake of exposition we begin by
considering zero restrictions on the matrices $A$ and $C_1$, which determine the
first moment of the conditional distribution of the vector of variables of
interest, and concentrate on a first-order autoregressive representation:

\[
A \begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= C_1 \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ B \begin{pmatrix} \nu^Y_t \\ \nu^M_t \end{pmatrix}.
\tag{3.5}
\]



Zero restrictions on the first equation can be represented as follows:

\[
A = \begin{pmatrix} 1 \;\; a_1' & 0' \\ \tilde{A}_1 & \bar{A}_1 \end{pmatrix},
\qquad
C_1 = \begin{pmatrix} c_1' & 0' \\ \tilde{C}_1 & \bar{C}_1 \end{pmatrix},
\]

where the first block row contains one row and the second $n-1$ rows, so that
$(n - n_1)$ endogenous variables and $(n - k_1)$ lagged variables are restricted
to zero in the first equation. $\tilde{A}_1$ is an $((n-1) \times n_1)$ matrix
containing the coefficients with which the $n_1$ variables entering the first
equation contemporaneously enter the remaining $n-1$ equations in the system;
$\bar{A}_1$ is an $((n-1) \times (n-n_1))$ matrix containing the coefficients
with which the $n - n_1$ variables not entering the first equation
contemporaneously enter the remaining $n-1$ equations in the system.
Analogously, $\tilde{C}_1$ is an $((n-1) \times k_1)$ matrix containing the
coefficients with which the $k_1$ variables entering the first equation with a
lag enter the remaining $n-1$ equations in the system, and $\bar{C}_1$ is an
$((n-1) \times (n-k_1))$ matrix containing the coefficients with which the
$n - k_1$ variables not entering the first equation with a lag enter the
remaining $n-1$ equations as lagged variables. Represent the first row of $F$ as
$(1 \;\; f')$. Admissibility implies

\[
f' \, (\bar{A}_1 \mid \bar{C}_1) = 0;
\]

in fact only when these conditions are satisfied do the first rows of $FA$ and
$FC_1$ feature the same restrictions as the first rows of $A$ and $C_1$.

Identification implies that the only solution is $f' = 0$, as the first row of an
$(n \times n)$ identity matrix has unity as its first element and zeros as the
remaining $(n-1)$ elements. Therefore the condition for identification is

\[
\operatorname{rank}\,(\bar{A}_1 \mid \bar{C}_1) = n - 1.
\]

This is a necessary and sufficient condition for identification. As
$(\bar{A}_1 \mid \bar{C}_1)$ has $(n-1)$ rows and $(n - n_1) + (n - k_1)$
columns, a necessary condition for identification is that the number of columns
is sufficiently large to let the rank be equal to $n-1$:

\[
n - k_1 \geq n_1 - 1.
\]

Therefore, for the first equation to be identified, the number of omitted
lagged variables must be at least as large as the number of included
contemporaneous variables minus one (the one variable with respect to which the
equation is normalized). At this point it is natural to state that the model is
not identified when $n - k_1 < n_1 - 1$, the model is just-identified when
$n - k_1 = n_1 - 1$, and the model is over-identified, with
$n + 1 - (n_1 + k_1)$ over-identifying restrictions, when $n - k_1 > n_1 - 1$.
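
The counting rule just stated is easy to mechanise. The helper below is a small
sketch (the function name and return format are illustrative choices, not taken
from the text); it classifies the first equation from n, the number of included
contemporaneous variables n1 and the number of included lagged variables k1.
Since the order condition is only necessary, a "just-identified" or
"over-identified" verdict still presupposes that the rank condition holds.

```python
def order_condition(n, n1, k1):
    """Classify the first equation by the order condition.

    n  : number of equations (and of endogenous variables)
    n1 : contemporaneous variables included (the normalised one counts)
    k1 : lagged variables included
    Returns (status, number of over-identifying restrictions).
    """
    excluded_lagged = n - k1      # lagged variables omitted from the equation
    required = n1 - 1             # included contemporaneous variables minus one
    if excluded_lagged < required:
        return "not identified", 0
    if excluded_lagged == required:
        return "just-identified", 0
    return "over-identified", n + 1 - (n1 + k1)
```

For instance, with n = 3 an equation including two contemporaneous and one
lagged variable omits two lagged variables while needing only one, so it
carries one over-identifying restriction.
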

This discussion of identifiability can be generalised to non-zero restrictions
by considering the following representation: defining

\[
w_t = \begin{pmatrix} Y_t \\ M_t \\ Y_{t-1} \\ M_{t-1} \end{pmatrix},
\qquad (A \mid -C_1) = D,
\]

the system becomes

\[
D w_t = e_t.
\tag{3.6}
\]



General restrictions on the $i$-th equation can be represented as

\[
R_i D_i = 0,
\]

where $D_i$ is the $i$-th row of $D$ written as a column vector and $R_i$ is the
$(k_i \times (n + n))$ matrix imposing $k_i$ restrictions on the $2n$ elements
of the $i$-th equation in the system.

The necessary and sufficient condition for identification is then

\[
\operatorname{rank}\, R_i (D_1 \mid D_2 \mid \dots \mid D_n) = n - 1,
\]

which shows the equivalence between the conditions for short-run identification
and the conditions required to achieve long-run identification of cointegrating
parameters. We end this section by noting that short-run and long-run
identification in a VECM are two completely separate problems ([16]). Consider
the simplest VECM representation of a first-order VAR:


\[
\Delta y_t = (A_1 - I)\, y_{t-1} + u_t = \Pi\, y_{t-1} + u_t .
\]

When the long-run identification problem is solved we decompose $\Pi$ into
$\alpha\beta'$ and we can rewrite the VECM as follows:

\[
\Delta y_t = \alpha z_{t-1} + u_t, \qquad z_{t-1} = \beta' y_{t-1},
\]

which makes clear that the identification of the parameters in the structural
form

\[
A \Delta y_t = A \alpha z_{t-1} + A u_t
\]

is independent of the identification of the parameters in the matrix $\beta$.

3.2 Identification in the “Cowles Commission” approach

The traditional approach to econometric modelling of the monetary transmission
mechanism, usually referred to as the Cowles Commission approach, is aimed at
the quantitative evaluation of the effects of modifications in the exogenous
variables of the system on its endogenous variables. Variables controlled
by the monetary policy maker (the instruments of monetary policy) are taken
as exogenous, while macroeconomic variables, which represent the final goals
of the policy maker, are taken as endogenous. The policy experiment of interest
usually consists in modifying such exogenous variables to assess the impact on
the macroeconomic variables of interest. Leaving aside the deterministic
component, the Cowles Commission specification modifies the general dynamic
model of the previous section as follows:



where Y represents the vector of macroeconomic variables of interest and M the
vector of monetary variables determined by the interaction between the monetary
policy maker and the economy, while a sub-vector of the monetary variables is
assumed exogenous because it is directly and fully controlled by the monetary
policy maker. The process generating these exogenous variables does not contain
any interaction with the other variables in the system. $C_1(L)$ and $C_2(L)$
are polynomials in the lag operator $L$, taking the general form
$C_i(L) = C_{i,0} + C_{i,1} L + C_{i,2} L^2 + \dots + C_{i,n} L^n$.

The general conditions for identification derived in the previous section are
applicable to the above specification. In fact, treating some variables as
exogenous aids identification, in that exogenous variables are treated, from the
point of view of identification, like the lagged endogenous variables.
Considering the general condition for identification

\[
\operatorname{rank}\, R_i (D_1 \mid D_2 \mid \dots \mid D_n) = n - 1,
\]

note that the inclusion of exogenous variables adds columns to the matrix $D$,
and hence coefficients on which the restrictions in $R_i$ can be imposed, and
therefore increases the chances for the model to be identified.
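
The rank condition can be checked mechanically. The sketch below is an
illustration under assumptions of its own: it uses a hypothetical two-equation
demand and supply system (not the model of this chapter) in which an exogenous
income variable x enters demand only, writes each equation as a row of D, and
tests whether rank(R_i D') equals n - 1 with exact rational arithmetic.

```python
from fractions import Fraction

def rank(M):
    """Rank of a matrix by exact Gauss-Jordan elimination."""
    M = [[Fraction(x) for x in row] for row in M]
    r = 0
    for c in range(len(M[0]) if M else 0):
        pivot = next((i for i in range(r, len(M)) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(len(M)):
            if i != r and M[i][c] != 0:
                f = M[i][c] / M[r][c]
                M[i] = [a - f * b for a, b in zip(M[i], M[r])]
        r += 1
    return r

def is_identified(D, R_i):
    """Check rank(R_i D') = n - 1, where the rows of D are the equations."""
    n = len(D)
    product = [[sum(r * d for r, d in zip(R_row, D_row)) for D_row in D]
               for R_row in R_i]
    return rank(product) == n - 1

# Hypothetical system in w = (q, p, x)', with x an exogenous income variable:
#   demand: q - a*p - b*x = e1        supply: q - c*p = e2
a, b, c = Fraction(-1), Fraction(1, 2), Fraction(1)
D = [[1, -a, -b],
     [1, -c, 0]]
R_supply = [[0, 0, 1]]    # supply excludes x: one zero restriction
R_demand = []             # demand carries no restriction

supply_identified = is_identified(D, R_supply)
demand_identified = is_identified(D, R_demand)
```

In this toy system the exogenous variable identifies the supply equation, from
which it is excluded, provided it actually enters demand (b different from
zero); the demand equation, with no restriction, is not identified.
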

3.2.1 An illustration: identifying the IS-LM-AD-AS model 

Let us consider the simplest possible macroeconomic model for a closed economy 
to illustrate how conditions for identification can be checked. The model consists 
of four equations: 




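\begin{align}
y_t &= c_{11} + y_t^{*} - a_{13}\,(R_t - \pi_t^{e}) + \varepsilon_{1t} \tag{3.8} \\
\pi_t &= \pi_t^{e} + a_{21}\,(y_t - y_t^{*}) + \varepsilon_{2t} \tag{3.9} \\
m_t - p_t &= c_{13} + a_{31}\, y_t - a_{33}\, R_t + \varepsilon_{3t} \tag{3.10} \\
\pi_t^{e} &= \beta \pi_{t-1} + (1 - \beta)\, \pi_t + \varepsilon_{4t}. \tag{3.11}
\end{align}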

Equation (3.10) describes an LM equation which, for $a_{31} = 1$, relates the
nominal interest rate to (the log of) the velocity of circulation of money.
Equation (3.8) describes an IS curve in a closed economy and shows immediately
that the monetary policy authority can influence the level of activity only if,
by controlling the nominal interest rate, it manages to influence the real
interest rate. Equation (3.9) is an expectations-augmented Phillips curve,
according to which actual inflation is determined by expected inflation and by
deviations of output from its potential level $y_t^{*}$, which we take as a
deterministic trend. Equation (3.11) describes the mechanism with which
expectations are formed. The extreme case of price stickiness is obtained by
setting $\beta = 1$, while the case of rational expectations and perfect price
flexibility is obtained by setting $\beta = 0$.

Note that no equation for money is included in the model. In fact money
supply is not modelled, as it is considered exogenous, i.e. fully controlled by
the monetary authority. The econometrician's task is the estimation of the
unknown parameters in order to simulate the impact of different paths for the
exogenous variable controlled by the monetary authority. The model uses four
equations to determine four endogenous variables, $\pi_t$, $\pi_t^{e}$, $R_t$
and $y_t$, for given values of the two exogenous variables $y_t^{*}$ and $m_t$.
The exogeneity status is attributed to $y_t^{*}$ and $m_t$ either because they
describe the available technology and demography or because they are fully
controlled by the policy-maker.

Consider the extreme case of price stickiness, given by $\beta = 1$, and use
equation (3.11) to eliminate expected inflation from the model; the
IS-LM-AD-AS model can then be specified as a special case of our general
specification (3.7):

Note that $y_t^{*}$ is a deterministic trend and money is exogenous in that its
rate of growth is fixed at $c_{41}$; the effect of monetary policy on
macroeconomic variables is evaluated by assessing the impact on the system of
modifications in this parameter. To apply our general discussion of
identifiability to this specific case, consider that, using the representation
(3.6), we have:

The restrictions on the first equation are imposed by the following matrix:

\[
R_1 = \left( \begin{array}{rrrrrrrrrrrr}
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
1 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & -1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 0 & 1
\end{array} \right)
\]


The first equation is then over-identified, with five (nine minus four)
over-identifying restrictions, when the following rank condition is satisfied:

\[
\operatorname{rank}\, R_1 (D_1 \mid D_2 \mid \dots \mid D_5) = 4.
\]


By applying the procedure to all the equations in the system it can be shown
that the second equation is over-identified with five over-identifying
restrictions, the third equation with three, the fourth equation with six and
the fifth equation with five. We conclude that the model is identified and
imposes a total of twenty-four over-identifying restrictions.
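
The bookkeeping behind these counts can be reproduced in a few lines. The sketch
below transcribes the restriction matrix for the first equation as printed above
(nine rows, twelve columns), checks with a simple sufficient test that its rows
are linearly independent, and adds up the over-identifying restrictions. The
counts for equations two to five are taken as stated in the text, not
recomputed.

```python
# R_1 as printed for the first equation: 9 restrictions on 12 coefficients.
R1 = [
    [0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0],
    [0, 0, -1, 0, 0, 0, 0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0],
    [0, 0, -1, 0, 0, 0, 0, 0, 0, 0, 1, 0],
    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1],
]

n = 5   # equations in the system

def has_independent_rows(R):
    """Sufficient check: each row owns a column where all other rows are zero."""
    for i, row in enumerate(R):
        owned = any(
            row[j] != 0
            and all(other[j] == 0 for k, other in enumerate(R) if k != i)
            for j in range(len(row))
        )
        if not owned:
            return False
    return True

over_id_first = len(R1) - (n - 1)                    # nine minus four
over_id_by_equation = [over_id_first, 5, 3, 6, 5]    # others as stated in the text
total_over_id = sum(over_id_by_equation)
```

The total obtained this way is the sum of the five equation-level counts, which
gives twenty-four.
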


3.3 The great critiques 

The Cowles Commission approach to the identification of structural econometric
models broke down in the seventies, when it was discovered that this type of model

“...did not represent the data, ... did not represent the theory... were ineffective for 
practical purposes of forecasting and policy evaluation...” ([17]). 

Different explanations of the failure of the Cowles Commission approach gave
rise to the different prominent methods of empirical research: the LSE (London
School of Economics) approach, the VAR approach, and the intertemporal
optimization-Real Business Cycle approach. We shall discuss and illustrate the
empirical research strategy of these three alternative approaches by interpreting
them as different proposals to solve problems observed in the Cowles Commission
approach.

The LSE approach was initiated by Denis Sargan but owes its diffusion to
a number of Sargan's students, and it is extremely well described in the book by
David Hendry ([13]). This approach to macroeconometric modelling explains
the ineffectiveness of the Cowles Commission models for the practical purposes
of forecasting and policy in terms of their inability to represent the data.
The root of the failure of the traditional approach lies in the little attention
paid to the statistical model implicit in the estimated structure. Consider our
simple example of the IS-LM-AD-AS model: the identified structure is estimated
without checking that the implicit statistical model is an accurate description
of the data. Spanos ([20]) considers the case of a simple demand and supply
model to show how reduced forms are ignored in the traditional approach; in
fact most of the widely used estimators allow one to derive numerical values for
the structural parameters without even seeing the statistical model represented
by the reduced form. There are several possible causes for the inadequacy of the
statistical models implicit in structural econometric models: omission of
relevant variables, omission of the relevant dynamics for the included variables
(note, for example, that the estimated money demand relation in our simple
example is a simple, static equation), and invalid assumptions of exogeneity.
The LSE solution
to the specification problem is the theory of reduction. Any econometric model
is interpreted as a simplified representation of the unobservable Data
Generating Process (DGP). For the representation to be valid or “congruent”, to
use Hendry's own terminology, the information lost in moving from the DGP to
its representation, given by the adopted specification, must be irrelevant to
the problem at hand. Adequacy of the statistical model is evaluated by analyzing
the reduced form. Therefore, the prominence of the structural model with respect
to the reduced form representation in the Cowles Commission approach to
identification and specification is reversed. The LSE approach starts its
specification and identification procedure by specifying a general dynamic
reduced form model. The congruency of such a model cannot be directly assessed
against the true DGP, which is not observable. However, a series of diagnostic
tests are proposed as criteria for evaluating the congruency of the baseline
model. The general principle guiding the application of such criteria is that
congruent models should feature truly random residuals; hence any departure of
the vector of residuals from a random normal multivariate distribution should be
taken as a symptom of mis-specification. Once the baseline model has been
validated, the reduction
process begins by simplifying the dynamics and by reducing the dimensionality
of the model, omitting the equations for those variables for which the
null hypothesis of exogeneity is not rejected. In fact the concept of exogeneity
is refined within the LSE approach and is broken down into different categories,
determined by the purpose of the estimation of the econometric model. A further
stage in the simplification process could be the imposition of the rank reduction
restrictions on the matrix determining the long-run equilibria of the system and
the identification of cointegrating vectors. The product of this stage is a
statistical model for the data, possibly discriminating between short-run
dynamics and long-run equilibria. Only after this validation procedure are
structural models identified and estimated. Just-identified specifications do
not require any further testing, as their implicit reduced form does not impose
any further restrictions on the baseline statistical model. The validity of
over-identified specifications is instead tested by evaluating the validity of
the restrictions implicitly imposed on the general reduced form. After these
last diagnostics for the validity of the reduction process have been performed,
structural models are used for the practical purposes of forecasting and policy
evaluation.

If the LSE approach finds its explanation of the failure of Cowles Commission
models in their inability to represent the data, different approaches,
initiated by two famous critiques by Lucas ([15]) and Sims ([19]), relate their
explanations of the failure to the inability of Cowles Commission models to
represent the theory. The general class of theoretical models of reference for
these two critiques is that of forward-looking intertemporal optimization
models. Lucas attacks the identification scheme proposed by the Cowles
Commission by pointing out that these models do not take expectations explicitly
into account, and therefore the parameters identified within the Cowles
Commission approach are in fact mixtures of “deep parameters”, describing
preferences and technology in the economy, and expectational parameters which,
by their nature, are not stable across different policy regimes. The main
consequence of such instability across different regimes is that traditional
structural macro-models are useless for the purpose of policy simulation. To
show the point let us reconsider the case of our simple model of the monetary
transmission mechanism estimated for simulating the impact of different monetary
policies on macroeconomic variables.

Assume the following DGP, in which expected monetary policy matters for 
the determination of macroeconomic variables in the economy: 

\[
\begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= D_1(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ D_2\, E_t M_{t+1}
+ \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix}.
\tag{3.13}
\]



A Cowles Commission model is estimated without explicitly including
expectations over a sample featuring the following money supply rule:

\[
M_{t+1} = a_0 + M_t.
\tag{3.14}
\]

The fitted model will therefore have the following specification:

\[
\begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= D_1(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ D_2 a_0 + D_2 M_t
+ \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix}.
\tag{3.15}
\]

(3.15) is not the correct model to simulate the impact of any rule different
from (3.14). Consider the case of

\[
M_{t+1} = a_1 + M_t.
\]

The correct model, in terms of observable variables, for simulating the effect
of the new policy would be

\[
\begin{pmatrix} Y_t \\ M_t \end{pmatrix}
= D_1(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix}
+ D_2 a_1 + D_2 M_t
+ \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix},
\tag{3.16}
\]

and simulation based on (3.15) would give the wrong prediction of the effect
of monetary policy.
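
The argument can be illustrated with a small simulation. The sketch below is a
stylised scalar version built on assumptions of its own, not the model of the
text: output y depends on its own lag and on expected future money with
hypothetical coefficients D1 and D2, money follows the deterministic rule
m(t+1) = a + m(t), and a reduced form without explicit expectations is fitted
by OLS on data from the old rule (drift A0). The fitted intercept absorbs
D2*A0, so forecasts under a new rule (drift A1) are biased by roughly
D2*(A1 - A0).

```python
import random

D1, D2 = 0.5, 0.5      # hypothetical DGP coefficients
A0, A1 = 1.0, 3.0      # money growth under the old and the new rule
SIGMA_U = 0.1          # standard deviation of the output shock

def simulate(a, T, y_prev, m, rng):
    """Generate rows (y_t, y_{t-1}, m_t) under the rule m_{t+1} = a + m_t."""
    rows = []
    for _ in range(T):
        expected_m_next = a + m                  # E_t[m_{t+1}] under the rule
        y = D1 * y_prev + D2 * expected_m_next + rng.gauss(0.0, SIGMA_U)
        rows.append((y, y_prev, m))
        y_prev, m = y, a + m
    return rows, y_prev, m

def solve(M, b):
    """Solve M x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(M, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda i: abs(aug[i][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for i in range(c + 1, n):
            f = aug[i][c] / aug[c][c]
            aug[i] = [x - f * z for x, z in zip(aug[i], aug[c])]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - tail) / aug[i][i]
    return x

def ols(rows):
    """Regress y_t on (1, y_{t-1}, m_t); returns [intercept, coef_y, coef_m]."""
    X = [(1.0, yp, m) for _, yp, m in rows]
    Y = [y for y, _, _ in rows]
    XtX = [[sum(x[i] * x[j] for x in X) for j in range(3)] for i in range(3)]
    XtY = [sum(x[i] * y for x, y in zip(X, Y)) for i in range(3)]
    return solve(XtX, XtY)

rng = random.Random(0)
old_sample, y_last, m_last = simulate(A0, 2000, 0.0, 0.0, rng)
coef = ols(old_sample)                           # fitted on the old regime only

new_sample, _, _ = simulate(A1, 200, y_last, m_last, rng)
errors = [y - (coef[0] + coef[1] * yp + coef[2] * m) for y, yp, m in new_sample]
bias = sum(errors) / len(errors)                 # systematic forecast error
```

The fitted coefficients are perfectly adequate within the old regime; the bias
appears only when the policy rule changes, which is why in-sample fit gives no
warning of the problem.
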

The Sims ([19]) critique runs parallel to the Lucas critique but concentrates
on the status of exogeneity arbitrarily attributed to some variables to achieve
identification within structural Cowles Commission models. Sims argues that no
variable can be deemed exogenous in a world of forward-looking agents whose
behaviour depends on the solution of an intertemporal optimization model. Again
with reference to our example, reconsider the case for exogeneity of money
supply. If the monetary authority uses money supply as an instrument to achieve
given targets for the macroeconomic variables, then it would be very “natural”
for money supply to react not only to output and inflation but also to leading
indicators for these variables. By assuming money supply to be exogenous, the
estimated model completely omits a very relevant feedback and loses an important
feature of the data. Moreover, by incorrectly assuming exogeneity, the model
might induce a spurious statistical effectiveness of monetary policy in the
determination of macroeconomic variables. The endogeneity of money does
generate a correlation between macroeconomic variables and monetary variables
which, if money is invalidly assumed to be exogenous, could be interpreted as a
causal relation running from money to the macroeconomic variables.
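
A stylised simulation makes the point concrete. In the sketch below, which is an
illustration under assumptions of its own and not an example from the text,
money has no causal effect on output at all: next-period output depends only on
a leading indicator x, and the policy maker sets money in reaction to that same
indicator. A regression that treats money as exogenous nevertheless finds a
large coefficient, which vanishes once the indicator is partialled out.

```python
import random

rng = random.Random(1)
T = 5000
x = [rng.gauss(0.0, 1.0) for _ in range(T)]        # leading indicator at time t
m = [xi + rng.gauss(0.0, 0.5) for xi in x]         # money reacts to the indicator
y_next = [xi + rng.gauss(0.0, 0.5) for xi in x]    # y_{t+1}: driven by x_t, not m_t

def slope(u, v):
    """OLS slope from regressing v on u and a constant."""
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    cov = sum((a - mu) * (b - mv) for a, b in zip(u, v))
    var = sum((a - mu) ** 2 for a in u)
    return cov / var

def residuals(u, v):
    """Residuals from regressing v on u and a constant."""
    b = slope(u, v)
    mu, mv = sum(u) / len(u), sum(v) / len(v)
    return [(vi - mv) - b * (ui - mu) for ui, vi in zip(u, v)]

# Money treated as exogenous: apparent effect of m_t on y_{t+1}.
naive = slope(m, y_next)

# Same coefficient after partialling out the indicator (Frisch-Waugh step).
controlled = slope(residuals(x, m), residuals(x, y_next))
```

Under these assumed variances the population value of the naive slope is
1 / (1 + 0.25) = 0.8, while the controlled slope is zero: the whole "effect" of
money comes from the omitted feedback from the indicator to policy.
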

We shall devote the following sections to a deeper illustration of the different
approaches to identification developed in response to the problems with the
Cowles Commission approach; we shall then devote the rest of the book to the
illustration of how such different approaches are put to work to construct
macroeconometric models.


3.4 Identification in the LSE methodology 

To illustrate the LSE approach to identification, reconsider the Cowles
Commission specification of the IS-LM-AD-AS model. The Cowles Commission's
strategy directly specifies an identified structure of interest such as (3.12).
The LSE research strategy begins from the specification of a general statistical
model, i.e. a reduced form. There is no specific rule for the choice of the
baseline specification, the only constraint being that such a specification
should be sufficiently general to deliver a congruent representation of the
underlying unknown Data Generating Process. A possible baseline specification
could be the following:

Note that this model is much more general than the Cowles Commission
specification as far as the dynamics of all variables is concerned; moreover no
a priori restriction on the nature of the trend is imposed. The first step of
the model validation procedure consists in the evaluation of the lag truncation:
is the chosen length of the polynomial in the lag operator (L = 2, in our case)
long enough to capture the dynamics in the data? If the answer is yes, then the
next step of the specification strategy can be taken, to identify the long-run
structure and re-specify the reduced form as a VECM. As we have already pointed
out, this step can be skipped at the sole risk of a loss in efficiency. To keep
the LSE specification more directly comparable with the Cowles Commission
IS-LM-AD-AS model, we do run this risk and keep the reduced form in levels. The
last step of the LSE identification strategy is the specification of structural
models. Just-identified models do not impose any further restriction on the
validated reduced form, while over-identified structures do impose testable
restrictions on the reduced form. Testing such restrictions is the final model
evaluation criterion. We have seen
that in our specific example we have twenty-four over-identifying restrictions.
The validity of the over-identifying restrictions can be checked by comparing the
reduced form implicit in the structural model (3.12) with the general reduced
form (3.17). The reduced form implicit in the structural model (3.12) is found
by pre-multiplying the model by:

note that (3.18) imposes more than twenty-four restrictions on (3.17); in
fact it imposes twenty-four over-identifying restrictions on (3.17) in addition
to those necessary to determine the chosen specification for $m_t$ and
$y_t^{*}$. The LSE methodology finds the roots of the failure of Cowles
Commission models in the choice of specific rather than general baseline
specifications.
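
One standard way of carrying out this comparison, given here as a sketch of a
common practice and not as the procedure prescribed by the text, is a
likelihood ratio test of the restricted against the unrestricted reduced form.
With $T$ observations, $\hat{\Sigma}_U$ the residual covariance matrix of the
general reduced form, $\hat{\Sigma}_R$ the residual covariance matrix of the
reduced form implied by the structure, and $q$ the number of restrictions under
test (all symbols introduced here for illustration), the statistic is

```latex
\[
LR = T \left( \ln \left| \hat{\Sigma}_R \right|
            - \ln \left| \hat{\Sigma}_U \right| \right)
\;\overset{a}{\sim}\; \chi^2(q)
\quad \text{under the null that the restrictions are valid.}
\]
```

A large value of the statistic relative to the $\chi^2(q)$ critical value
signals that the structure discards information contained in the validated
reduced form.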


3.5 Identification in the VAR methodology 

We have seen that the LSE methodology has interpreted the failure of the
traditional Cowles Commission approach as the result of a specification strategy
leading to mis-specified and ill-identified models. The LSE methodology,
however, does not question the potential of macroeconometric modelling for
simulation and econometric policy evaluation. The VAR approach shares with the
LSE approach the diagnosis of the problem of Cowles Commission models, but it
also questions the potential of macroeconometric modelling for policy simulation
and econometric policy evaluation. VAR models differ from structural LSE models
as to the purpose of their specification and estimation. In the traditional
approach the typical question asked within a macroeconometric framework is
“What is the optimal response by the monetary authority to movements in
macroeconomic variables in order to achieve given targets for the same
variables?”. After the Lucas critique, questions like “How should a Central
Bank respond to shocks in macroeconomic variables?” are to be answered within
the framework of quantitative monetary general equilibrium models of the
business cycle. So the answer has to be based on a theoretical model rather than
on an empirical ad hoc macroeconometric model. Within this framework there is a
new role for empirical analysis, i.e. to provide the evidence on the stylized
facts to be included in the theoretical model adopted for policy analysis and to
decide between competing general equilibrium monetary models. The
operationalization of this research programme is very well described in a recent
paper by Christiano, Eichenbaum and Evans ([5]). There are three relevant steps:

• monetary policy shocks are identified in actual economies; 

• the response of relevant economic variables to monetary shocks is then 
described; 

• the same experiment is then performed in the model economies to compare 
actual and model-based responses as an evaluation tool and a selection 
criterion for theoretical models. 

To pin down more precisely the symmetries and differences between LSE-type
structural models and VAR models, consider again the case of the monetary
transmission mechanism (MTM). The two types of models have a common structure,
which we have represented as follows:



The main difference between the two approaches lies in the aim for which models 
are estimated. 

Traditional Cowles Commission structural models are designed to identify
the impact of policy variables on macroeconomic quantities in order to determine
the value to be assigned to the monetary instruments (M) to achieve a given
target for the macroeconomic variables (Y), assuming exogeneity of the policy
variables in M on the grounds that these are the instruments controlled by the
policymaker. Identification in traditional structural models is obtained without
assuming the orthogonality of structural disturbances: remember that we have
labelled as structural disturbances in traditional models and LSE models the
vector e, where



As we shall see, the impact of monetary policy is described by dynamic
multipliers, which describe the response of macroeconomic variables to a
modification in the exogenous monetary instruments controlled by the policy
maker. Dynamic multipliers are traditionally computed without separating changes
in the monetary variable into the expected and unexpected components.
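
As a minimal illustration of what a dynamic multiplier is, consider a
hypothetical single-equation model y(t) = d*y(t-1) + c*m(t) with the instrument
m taken as exogenous (the equation and the parameter values are assumptions made
for this sketch only). The multiplier at horizon h of a one-off unit change in m
is c*d**h, and it can be recovered by comparing a baseline simulation with a
perturbed one.

```python
def dynamic_multipliers(d, c, horizon):
    """Responses of y at horizons 0..horizon to a one-off unit change in m."""
    return [c * d ** h for h in range(horizon + 1)]

def simulate(d, c, m_path, y0=0.0):
    """Path of y for y_t = d*y_{t-1} + c*m_t, given a path for m."""
    y, out = y0, []
    for m in m_path:
        y = d * y + c * m
        out.append(y)
    return out

baseline = simulate(0.5, 2.0, [0.0] * 6)
perturbed = simulate(0.5, 2.0, [1.0] + [0.0] * 5)   # unit change in m at time 0
difference = [p - b for p, b in zip(perturbed, baseline)]

long_run = 2.0 / (1 - 0.5)    # effect of a permanent unit change: c / (1 - d)
```

The perturbation here is treated as a change in an exogenous path, with no
distinction between its anticipated and unanticipated parts, which is exactly
the feature of the traditional computation noted above.
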
The assumed exogeneity of the monetary variables makes the model invalid
for policy analysis if monetary policy reacts endogenously to the macroeconomic
variables. The LSE methodology would recognise the problem of the invalid
exogeneity assumption for M; it would then proceed to the identification of an
alternative enlarged model (presumably such identification would be obtained
through the imposition of a priori restrictions on the dynamics of the lagged
variables). However, the new model would still be used for simulation and
econometric policy evaluation, whenever the appropriate concepts of exogeneity
(respectively, as we shall see, strong exogeneity and super exogeneity) were
satisfied by the adopted specification.

VAR modelling would reject the Cowles Commission identifying restrictions as
“incredible” for reasons not very different from the ones pinned down by the LSE
approach; however, VAR models of the transmission mechanism are not estimated
to yield advice on the best monetary policy. They are rather estimated to
provide empirical evidence on the response of macroeconomic variables to monetary
policy impulses in order to discriminate between alternative theoretical models
of the economy. Monetary policy actions should be identified using restrictions
independent of the competing models of the transmission mechanism under
empirical investigation, taking into account the potential endogeneity of policy
instruments.

In a series of papers, Christiano, Eichenbaum and Evans ([3], [4]) apply the
VAR approach to derive “stylized facts” on the effect of a contractionary policy
shock, and conclude that plausible models of the monetary transmission
mechanism should be consistent at least with the following evidence on prices,
output and interest rates:

(i) the aggregate price level initially responds very little;

(ii) interest rates initially rise;

(iii) aggregate output initially falls, with a J-shaped response, with a zero
long-run effect of the monetary impulse.

Such evidence leads to the dismissal of traditional real business cycle models,
which are not compatible with the liquidity effect of monetary policy on interest
rates, and of the Lucas ([14]) model of money, in which the effect of monetary
policy on output depends on “price misperceptions”. The evidence seems to be more
in line with alternative interpretations of the monetary transmission mechanism
based on sticky prices models (Goodfriend and King [11]), limited participation
models (Christiano and Eichenbaum [2]) or models with indeterminacy-sunspot
equilibria (Farmer [9]).

Having stated the objective of VAR models, we are now in a position to assess
how the technical opportunities for identification, estimation and simulation
are exploited to analyse the MTM. VAR models concentrate on shocks. First the
relevant shocks are identified; then the response of the system to shocks is
described by analyzing impulse responses (the propagation mechanism of the
shocks), forecast error variance decompositions, and historical decompositions.

Following Amisano and Giannini ([1]), we represent the general structural
model of interest as follows:

$$A \begin{pmatrix} Y_t \\ M_t \end{pmatrix} = C(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix} + B \begin{pmatrix} v^Y_t \\ v^M_t \end{pmatrix} \qquad (3.20)$$


where Y and M are vectors of macroeconomic (non-policy) variables (e.g. output
and prices) and variables controlled by the monetary policymaker (e.g. interest
rates and monetary aggregates containing information on monetary policy actions)
respectively. Matrix A describes the contemporaneous relations among the
variables and C(L) is a finite-order matrix lag polynomial. $v = (v^Y, v^M)'$ is
a vector of structural disturbances to the non-policy and policy variables,
normally independently distributed with identity variance-covariance matrix;
non-zero off-diagonal elements of B allow some shocks to affect directly more than
one endogenous variable in the system.

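The structural model (3.20) is not directly observable; however, a VAR can
be estimated as the reduced form of the underlying structural model:

$$\begin{pmatrix} Y_t \\ M_t \end{pmatrix} = A^{-1} C(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix} + \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix} \qquad (3.21)$$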


where u denotes the VAR residual vector, normally independently distributed
with full variance-covariance matrix $\Sigma$. The relation between the VAR
residuals in u and the structural disturbances in v is therefore:

$$A \begin{pmatrix} u^Y_t \\ u^M_t \end{pmatrix} = B \begin{pmatrix} v^Y_t \\ v^M_t \end{pmatrix}$$

Undoing the partitioning, we have

$$u_t = A^{-1} B v_t \qquad (3.22)$$


from which we can derive the relation between the variance-covariance matrix
of $u_t$ (observed) and the variance-covariance matrix of $v_t$ (not observed) as
follows:

$$E(u_t u_t') = A^{-1} B \, E(v_t v_t') \, B' (A^{-1})'.$$

Substituting population moments with sample moments we have:

$$\hat{\Sigma} = A^{-1} B I B' (A^{-1})'$$

$\hat{\Sigma}$ contains $n(n+1)/2$ different elements; this is the maximum number
of identifiable parameters in matrices A and B, and therefore identifying
restrictions are imposed on these matrices. We shall analyze the different types of
identifying restrictions in one of the next chapters, devoted to VAR models. Once
the shocks have been identified, the dynamic properties of the system can be
described by analyzing the response of all variables in the system to such shocks.
Note that VAR models do not explicitly include expectations and might therefore
be subject to the Lucas critique. The general defense of VAR modellers against the
Lucas critique relies upon the fact that what is perturbed in the simulations are
the shocks themselves, and therefore the estimated parameters are not modified for
simulation purposes.
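The counting argument can be illustrated with a minimal numerical sketch. It assumes, purely for illustration, the simplest recursive scheme (A equal to the identity and B lower triangular); the choice of identifying restrictions is left to the later chapter on VAR models, and the numbers in `B_true` are hypothetical. Under this assumption B has exactly n(n+1)/2 free elements and can be recovered from the covariance matrix by a Cholesky factorisation.

```python
import numpy as np

def identify_recursive(sigma):
    """Recover B from Sigma = B B', assuming A = I and B lower triangular."""
    return np.linalg.cholesky(sigma)

# Hypothetical impact matrix for a two-variable system (Y, M); illustrative numbers.
B_true = np.array([[1.0, 0.0],
                   [0.5, 2.0]])
A = np.eye(2)  # assumed: all contemporaneous effects are captured by B

# Reduced-form covariance implied by u_t = A^{-1} B v_t with E(v_t v_t') = I.
A_inv = np.linalg.inv(A)
sigma = A_inv @ B_true @ B_true.T @ A_inv.T

B_hat = identify_recursive(sigma)

n = sigma.shape[0]
n_moments = n * (n + 1) // 2          # distinct elements of Sigma
n_unrestricted = 2 * n * n            # elements of A and B before any restriction
n_free_recursive = n * (n + 1) // 2   # free elements of a lower triangular B

print(B_hat)
print(n_moments, n_unrestricted, n_free_recursive)
```

With two variables the covariance matrix supplies three moments against eight unrestricted elements of A and B, so five restrictions are needed; the recursive scheme imposes exactly that many.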

3.6 Identification in intertemporally optimized models 

The natural outcome of the Lucas critique are intertemporally optimized models
in which deep parameters, independent from a particular policy regime, are
identified separately from expectational parameters, specific to policy regimes.
The intertemporal optimisation approach to macroeconomics leads naturally to
a framework for identification and estimation of the deep parameters of interest.
In fact the first order conditions for the solution of intertemporal optimization
problems are orthogonality conditions which can be exploited for identification
and estimation of the structural parameters of interest. To illustrate the point
consider the simplest possible version of the inflation targeting problem, see
Svensson ([21]). The central bank faces the following intertemporal optimisation
problem:

$$\text{Minimize } E_t \sum_{i=0}^{\infty} \delta^i L_{t+i} \qquad (3.23)$$

where:

$$L_t = \frac{1}{2}\left[(\pi_t - \pi^*)^2 + \lambda x_t^2\right] \qquad (3.24)$$


where $E_t$ denotes expectations conditional upon the information set available
at time t, $\delta$ is the relevant discount factor, L is the loss function of the
central bank, $\pi_t$ is inflation at time t, $\pi^*$ is the target level of
inflation, x represents deviations of output from its natural level, and $\lambda$
is a parameter which determines the degree of flexibility in inflation targeting.
When $\lambda = 0$ the central bank is defined as a strict inflation targeter. As
the monetary instrument is the policy rate, $i_t$, the structure of the economy
must be described to obtain an explicit form for the policy rule. We consider the
following specification for aggregate supply and demand in a closed economy¹:


$$x_{t+1} = \beta_x x_t - \beta_r (i_t - E_t\pi_{t+1} - r) + u^d_{t+1} \qquad (3.25)$$

$$\pi_{t+1} = \pi_t + \alpha_x x_t + u^s_{t+1} \qquad (3.26)$$

¹ As we shall see in one of the next chapters, these two functions are the outcome
of the solution of intertemporal optimisation problems by agents in the private
sector.

Note that macroeconomic variables do not react contemporaneously to the
instrument of monetary policy; this is a first identifying restriction for the
relevant parameters in the model. As shown in Svensson ([21]), the first order
conditions for optimality may be written as follows:

$$\frac{dL}{di_t} = 0: \qquad E_t\pi_{t+2} - \pi^* = -\frac{\lambda}{\delta \alpha_x k} E_t x_{t+1} \qquad (3.27)$$

$$k = 1 + \frac{\delta \lambda k}{\lambda + \delta \alpha_x^2 k} \qquad (3.28)$$

(3.27) are orthogonality conditions involving all the deep parameters describing
the structure of preferences of the central banker, $\pi^*$, $\delta$, $\lambda$,
and just one parameter coming from the structure of the economy, $\alpha_x$. By
using (3.25) in (3.26), led one period and taken in expectation at time t, we
obtain:

$$E_t\pi_{t+2} = E_t\pi_{t+1} + \alpha_x \left[\beta_x x_t - \beta_r (i_t - E_t\pi_{t+1} - r)\right] \qquad (3.29)$$

and by substituting (3.29) in (3.27) we derive an interest rate rule:

$$i_t = r + \pi^* + \frac{1 + \alpha_x\beta_r}{\alpha_x\beta_r}\left(E_t\pi_{t+1} - \pi^*\right) + \frac{\beta_x}{\beta_r} x_t + \frac{\lambda}{\delta\alpha_x k\,\alpha_x\beta_r} E_t x_{t+1} \qquad (3.30)$$


The parameters in the interest rate rule (3.30) are convolutions of the
parameters describing the central bank's preferences $(\pi^*, \delta, \lambda)$
and of those describing the structure of the economy
$(\alpha_x, \beta_r, \beta_x, r)$. It is then impossible to assess from
the estimation of the rule whether the responses of central banks to output and
inflation are consistent with the parameters describing the impact of the policy
instrument on these variables. Note, for example, that the estimation of an
interest rate rule relating the policy rate to the output gap and to the deviation
of expected inflation from target does not help to distinguish a strict inflation
targeter ($\lambda = 0$, in the terminology of Svensson) from a flexible inflation
targeter ($\lambda > 0$).
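A small numerical sketch may help to see how the rule coefficients mix preference and structural parameters. It assumes that (3.28) reads k = 1 + δλk/(λ + δα_x²k) and that the rule (3.30) has the coefficients (1 + α_xβ_r)/(α_xβ_r) on the expected inflation gap, β_x/β_r on the output gap and λ/(δα_x k α_xβ_r) on the expected output gap. The function names and all parameter values are hypothetical, chosen only for illustration.

```python
def solve_k(lam, delta, alpha_x, tol=1e-12, max_iter=10_000):
    """Fixed point of k = 1 + delta*lam*k / (lam + delta*alpha_x**2*k), cf. (3.28)."""
    k = 1.0
    for _ in range(max_iter):
        k_new = 1.0 + delta * lam * k / (lam + delta * alpha_x ** 2 * k)
        if abs(k_new - k) < tol:
            return k_new
        k = k_new
    raise RuntimeError("fixed-point iteration for k did not converge")

def rule_coefficients(lam, delta, alpha_x, beta_x, beta_r):
    """Coefficients of the interest rate rule in the form assumed for (3.30)."""
    k = solve_k(lam, delta, alpha_x)
    return {
        "inflation_gap": (1.0 + alpha_x * beta_r) / (alpha_x * beta_r),
        "output_gap": beta_x / beta_r,
        "expected_output_gap": lam / (delta * alpha_x * k * alpha_x * beta_r),
    }

# Hypothetical structural parameters, identical for both central banks.
structure = dict(delta=0.99, alpha_x=0.4, beta_x=0.8, beta_r=0.6)

strict = rule_coefficients(lam=0.0, **structure)     # strict inflation targeter
flexible = rule_coefficients(lam=0.5, **structure)   # flexible inflation targeter

print(strict)
print(flexible)
```

In this sketch the coefficients on the expected inflation gap and on the current output gap depend only on the structure of the economy, so they are the same for the two central banks; the preference parameter λ enters only through the term in the expected output gap, where it is entangled with δ, α_x, β_r and k.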

In fact, there is only one empirical implication of the rule which can be
confronted with the data independently from the identification of the parameters
of interest, namely whether the parameter describing the reaction of policy
rates to a gap between expected and target inflation is larger than one. A
monetary policy which accommodates changes in inflation, i.e. one with a reaction
coefficient smaller than one, will not in general converge to the target rate
$\pi^*$. This empirical prediction is the one which has attracted most of the
discussion on estimated monetary policy rules (see Clarida, Gali and Gertler,
[6], [7], [8]).

By comparing the first order conditions for optimality, known as Euler equations,
with the explicit interest rate rule we note that the deep parameters of
interest are much more easily identifiable from (3.27). In fact, while in our
specific example (3.27) depends mainly on deep parameters describing taste and
technology, there are macroeconomic applications in which the Euler equations
depend only on these parameters. The identification and estimation strategy
naturally consistent with the intertemporal optimization approach is then to
derive first the Euler equations and use them to pin down the deep parameters of
interest. This step can be achieved by applying an estimation method directly
based on orthogonality conditions, the Generalised Method of Moments. Numerical
values are then attributed to the remaining parameters in the model, not
necessarily by estimation. Finally, models are simulated and evaluated by
comparing actual data with simulated data.

In the next chapters of the book we shall consider more deeply all the different
approaches to macroeconometric modelling by considering a common macroeconomic
issue: the analysis of the monetary transmission mechanism.

4


THE COWLES COMMISSION’S APPROACH 

4.1 Introduction 

The traditional approach to econometric modelling of the monetary transmission
mechanism, usually referred to as the Cowles Commission approach, is aimed at the
quantitative evaluation of the effects of modifications in the variables
controlled by the monetary policy maker (the instruments of monetary policy) on
the macroeconomic variables which represent the final goals of the policy maker.
We can identify three stages in the traditional approach:

• specification and identification of the theoretical model; 

• estimation of the relevant parameters, and assessment of the dynamic properties
of the model, with particular emphasis on the long-run properties;

• simulation of the effects of monetary policies. 

We have already discussed the Cowles Commission approach to specification 
and identification in the previous chapters. 

Before illustrating the approach at work we shall devote sections to the
discussion of estimation, simulation and policy evaluation.

4.2 Estimation in the Cowles Commission Approach 

The crucial features of the identification-specification stage, which are well
illustrated by our IS-LM-AD-AS example, are that the specified empirical model is
usually loosely related to theoretical models and that identification is achieved
by imposing many a priori restrictions delivering exogeneity status to a number of
variables. As a consequence, identification is easily achieved within Cowles
Commission models, usually with a large number of over-identifying restrictions.
We have also seen that criticisms of this approach attribute the roots of its
failure to the imposition of too many restrictions and to their incapability of
recovering the structural deep parameters of economic interest, describing the
preferences of the agents and the status of technology.

However, it is interesting to note that traditional modelling was in a sense
aware of the presence of some mis-specification in the estimated equations. Such
mis-specification resulted in departures from the conditions which warrant that
OLS estimators are BLUE. The solution proposed was not re-specification but rather
modification of the estimation techniques. This is well reflected in the structure
of the traditional textbooks, see for example Goldberger ([5]) and Johnston, where
the OLS estimator is introduced first and then different estimators are considered
as solutions to different pathologies in the model residuals. Pathologies are
identified as departures from the assumptions which guarantee that OLS estimators
are BLUE. I think that it is by now very well established that correcting the
estimator is a strategy clearly inferior to improving the specification, i.e.
correcting the model. Nevertheless, we devote some space to the discussion of
alternative estimators in that they could be a last resort, to be used when models
cannot be improved for lack of the necessary information.

4.2.1 Heteroscedasticity, autocorrelation and the GLS estimator 

Let us reconsider the single equation model of Chapter 1, to generalize it to
the case in which the hypotheses of diagonality and constancy of the conditional
variance-covariance matrix of the residuals do not hold:

$$y = X\beta + \epsilon \qquad (4.1)$$

$$\epsilon \sim n.i.d.\,(0, \sigma^2\Omega)$$

where the vector y contains T observations on the dependent variable, X
contains the $(T \times K)$ observations on the K explanatory variables, exogenous
for the estimation of the $(K \times 1)$ vector $\beta$, and $\Omega$ is a
$(T \times T)$ symmetric and positive definite matrix. When the OLS estimator is
applied to model (4.1) it delivers estimators which are consistent but not
efficient; moreover, the traditional formula for the variance-covariance matrix of
the OLS estimators, $\sigma^2 (X'X)^{-1}$, is wrong and leads to incorrect
inference. In fact, by using the standard algebra of Chapter 1 it can be shown
that the correct formula for the variance-covariance matrix of the OLS estimator
is:

$$\sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}.$$

A general solution to this problem is found by remembering that the inverse of a
symmetric positive definite matrix is also symmetric and positive definite, and
that for a given symmetric positive definite matrix $\Omega$ there always exists a
$(T \times T)$ non-singular matrix K such that $K'K = \Omega^{-1}$ and
$K\Omega K' = I_T$.

To see how the solution is implemented, consider the regression model obtained
by pre-multiplying both the right-hand side and the left-hand side of (4.1)
by K:

$$Ky = KX\beta + K\epsilon \qquad (4.2)$$

$$K\epsilon \sim n.i.d.\,(0, \sigma^2 I_T).$$

The OLS estimator of the parameters of the transformed model (4.2) satisfies all
the conditions for the application of the Gauss-Markov theorem, therefore the
estimator

$$\hat{\beta}_{GLS} = (X'K'KX)^{-1} X'K'Ky = (X'\Omega^{-1}X)^{-1} X'\Omega^{-1}y$$

known as the Generalised Least Squares estimator, is BLUE. The variance of
the GLS estimator, conditional upon X, becomes

$$Var\left(\hat{\beta}_{GLS} \mid X\right) = \sigma^2 (X'\Omega^{-1}X)^{-1}.$$

Note that, from the applicability of the Gauss-Markov theorem, it follows
immediately that the variance of any other linear unbiased estimator is equal to
the sum of the variance of the GLS estimator and a positive semi-definite matrix.
Consider for example the variance of the OLS and of the GLS estimators. Using the
fact that if A and B are positive definite and $A - B$ is positive semi-definite,
then $B^{-1} - A^{-1}$ is also positive semi-definite, we have:

$$\begin{aligned}
&(X'\Omega^{-1}X) - (X'X)(X'\Omega X)^{-1}(X'X) \\
&\quad = X'K'KX - (X'X)\left(X'K^{-1}(K')^{-1}X\right)^{-1}(X'X) \\
&\quad = X'K'\left[I - (K')^{-1}X\left(X'K^{-1}(K')^{-1}X\right)^{-1}X'K^{-1}\right]KX \\
&\quad = X'K'M_W'M_W KX
\end{aligned}$$

where

$$W = (K')^{-1}X, \qquad M_W = I - W(W'W)^{-1}W'.$$
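The algebra above can be checked numerically. The following is a minimal sketch, not a general implementation: it assumes a known Ω with elements ρ^|i-j| (the value of `rho`, the design matrix and the coefficients are hypothetical), builds one admissible matrix K from the Cholesky factor of Ω⁻¹, and compares the OLS and GLS variances with σ² normalised to one.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 50
X = np.column_stack([np.ones(T), rng.normal(size=T)])

rho = 0.7  # hypothetical value, used only to build a non-scalar Omega
idx = np.arange(T)
omega = rho ** np.abs(idx[:, None] - idx[None, :])
omega_inv = np.linalg.inv(omega)

# One admissible K: Omega^{-1} = C C' (Cholesky), so K = C' gives K'K = Omega^{-1}.
K_mat = np.linalg.cholesky(omega_inv).T

beta = np.array([1.0, 2.0])
eps = np.linalg.cholesky(omega) @ rng.normal(size=T)   # residuals with covariance Omega
y = X @ beta + eps

# GLS as OLS on the transformed model K y = K X beta + K eps ...
KX, Ky = K_mat @ X, K_mat @ y
beta_gls = np.linalg.solve(KX.T @ KX, KX.T @ Ky)
# ... and directly from (X' Omega^{-1} X)^{-1} X' Omega^{-1} y.
beta_gls_direct = np.linalg.solve(X.T @ omega_inv @ X, X.T @ omega_inv @ y)

# Variances of OLS (sandwich formula) and GLS, with sigma^2 = 1.
XtX_inv = np.linalg.inv(X.T @ X)
var_ols = XtX_inv @ X.T @ omega @ X @ XtX_inv
var_gls = np.linalg.inv(X.T @ omega_inv @ X)
gap_eigenvalues = np.linalg.eigvalsh(var_ols - var_gls)

print(beta_gls, beta_gls_direct)
print(gap_eigenvalues)   # non-negative up to rounding error
```

The eigenvalues of the difference between the two variance matrices are non-negative, which is the numerical counterpart of the positive semi-definiteness result derived above.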

The applicability of the GLS estimator requires an empirical specification
for the matrix K. We consider here two specific applications, where an appropriate
choice of such a matrix fixes the problems in the OLS estimator generated
respectively by the presence of first order serial correlation and of
heteroscedasticity in the residuals.

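Consider first the case of first order serial correlation in the residuals. We
have the following model:

$$y_t = x_t'\beta + u_t$$
$$u_t = \rho u_{t-1} + \epsilon_t$$
$$\epsilon_t \sim n.i.d.\,(0, \sigma^2_\epsilon)$$

which, using our general notation, can be re-written as:

$$y = X\beta + \epsilon \qquad (4.3)$$

$$\epsilon \sim n.i.d.\,(0, \sigma^2\Omega) \qquad (4.4)$$

$$\Omega = \begin{pmatrix}
1 & \rho & \rho^2 & \cdots & \rho^{T-1} \\
\rho & 1 & \rho & \cdots & \rho^{T-2} \\
\vdots & & \ddots & & \vdots \\
\rho^{T-2} & \cdots & \rho & 1 & \rho \\
\rho^{T-1} & \rho^{T-2} & \cdots & \rho & 1
\end{pmatrix}$$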


In this case the knowledge of the parameter $\rho$ allows the empirical
implementation of the GLS estimator. An intuitive procedure to implement the GLS
estimator could then be the following:

• estimate the vector $\beta$ by OLS and save the vector of residuals $\hat{u}_t$

• regress $\hat{u}_t$ on $\hat{u}_{t-1}$ to obtain an estimate $\hat{\rho}$ of $\rho$

• construct the transformed model and regress $(y_t - \hat{\rho} y_{t-1})$ on
$(x_t - \hat{\rho} x_{t-1})$ to obtain the GLS estimator of the vector of
parameters of interest.

Note that the above procedure, known as the Cochrane-Orcutt procedure,
could be iterated until convergence.
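The three steps above, iterated until the estimate of ρ settles, can be sketched as follows. This is an illustrative implementation under stated assumptions, not a reference one: the function names, the convergence rule and the simulated data (with hypothetical values for ρ and β) are choices made here for the example.

```python
import numpy as np

def ols(X, y):
    """OLS coefficients via least squares."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def cochrane_orcutt(X, y, tol=1e-8, max_iter=100):
    """Iterated Cochrane-Orcutt estimates of (beta, rho) for AR(1) residuals."""
    beta = ols(X, y)                      # step 1: OLS on the original model
    rho = 0.0
    for _ in range(max_iter):
        u = y - X @ beta                  # residuals of the original model
        rho_new = (u[1:] @ u[:-1]) / (u[:-1] @ u[:-1])   # step 2: u_t on u_{t-1}
        y_star = y[1:] - rho_new * y[:-1]                # step 3: quasi-differences
        X_star = X[1:] - rho_new * X[:-1]
        beta = ols(X_star, y_star)
        if abs(rho_new - rho) < tol:
            return beta, rho_new
        rho = rho_new
    return beta, rho

# Simulated data with hypothetical parameter values.
rng = np.random.default_rng(42)
T = 2000
X = np.column_stack([np.ones(T), rng.normal(size=T)])
rho_true, beta_true = 0.6, np.array([1.0, 2.0])
eps = rng.normal(size=T)
u = np.zeros(T)
for t in range(1, T):
    u[t] = rho_true * u[t - 1] + eps[t]
y = X @ beta_true + u

beta_co, rho_hat = cochrane_orcutt(X, y)
print(beta_co, rho_hat)
```

Note that the first observation is lost in the quasi-differencing step, and that the transformed constant equals (1 − ρ̂), so the coefficient on it is still the original intercept.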

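In the case of heteroscedasticity our general model becomes

$$y = X\beta + \epsilon \qquad (4.5)$$

$$\epsilon \sim n.i.d.\,(0, \Omega) \qquad (4.6)$$

$$\Omega = \begin{pmatrix}
\sigma^2_1 & 0 & 0 & \cdots & 0 \\
0 & \sigma^2_2 & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & \sigma^2_{T-1} & 0 \\
0 & 0 & \cdots & 0 & \sigma^2_T
\end{pmatrix}$$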

In this case, in order to construct the GLS estimator, we need to model
heteroscedasticity by choosing the K matrix appropriately. White ([16]) proposes a
specification based on the consideration that in the case of heteroscedasticity the
variance-covariance matrix of the OLS estimator takes the form:

$$\sigma^2 (X'X)^{-1} X'\Omega X (X'X)^{-1}$$

which can be used for inference once an estimator for $\Omega$ is available. The
following estimator of $\Omega$, built from the squared OLS residuals, is
proposed:

$$\hat{\Omega} = \begin{pmatrix}
\hat{u}^2_1 & 0 & 0 & \cdots & 0 \\
0 & \hat{u}^2_2 & 0 & \cdots & 0 \\
\vdots & & \ddots & & \vdots \\
0 & \cdots & 0 & \hat{u}^2_{T-1} & 0 \\
0 & 0 & \cdots & 0 & \hat{u}^2_T
\end{pmatrix}$$
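As a minimal sketch of how this proposal is used in practice, the function below computes the OLS coefficients together with the heteroscedasticity-consistent covariance matrix. It assumes that the diagonal matrix of squared OLS residuals stands in for the whole of σ²Ω; the function name and the simulated data are hypothetical.

```python
import numpy as np

def white_covariance(X, y):
    """OLS coefficients and White's heteroscedasticity-consistent covariance."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta_hat = XtX_inv @ X.T @ y
    u_hat = y - X @ beta_hat
    # X' diag(u_hat^2) X, computed without forming the (T x T) diagonal matrix.
    meat = (X * (u_hat ** 2)[:, None]).T @ X
    return beta_hat, XtX_inv @ meat @ XtX_inv

# Simulated data in which the residual standard deviation grows with the regressor.
rng = np.random.default_rng(7)
T = 200
x = rng.uniform(1.0, 5.0, size=T)
X = np.column_stack([np.ones(T), x])
y = 1.0 + 0.5 * x + x * rng.normal(size=T)

beta_hat, cov_white = white_covariance(X, y)
robust_se = np.sqrt(np.diag(cov_white))
print(beta_hat, robust_se)
```

The point estimates are the usual OLS ones; only the standard errors used for inference change.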


Alternative models for heteroscedasticity, known as ARCH (Autoregressive
Conditional Heteroscedasticity) processes, useful for high-frequency financial
series and based upon simultaneous modelling of the first two moments of
time-series processes, have been proposed by Engle ([4]) and Bollerslev ([1]).

4.2.2 Endogeneity 

The estimation of simultaneous systems requires the solution of a problem,
independent of mis-specification, which has prompted most of the advances in
estimation theory within the Cowles Commission approach: simultaneity.

To discuss simultaneity we consider the following representation of a model
of interest:

$$B y_t + \Gamma z_t = u_t \qquad (4.7)$$

$$u_t \sim n.i.d.\,(0, \Sigma)$$

where $y_t$ is a $(G \times 1)$ vector of endogenous variables and $z_t$ is a
$(M \times 1)$ vector of exogenous variables; these variables are considered
exogenous in that they are orthogonal to the residuals. Therefore, in the case of
a dynamic specification, $z_t$ contains all contemporaneous variables considered
orthogonal to the residuals, their lags, and the lags of the variables $y_t$. B
and $\Gamma$ are matrices of parameters, respectively $(G \times G)$ and
$(G \times M)$. Using matrix notation we can represent (4.7) alternatively as
follows:

$$B y' + \Gamma z' = u' \qquad (4.8)$$

where y is a $(T \times G)$ matrix, z is a $(T \times M)$ matrix and u is a
$(T \times G)$ matrix.

To illustrate the problems with the OLS estimator generated by endogeneity,
consider the first equation of the model, which we write as

$$y_1 = x_1\delta_1 + u_1 \qquad (4.9)$$

where $y_1$ is a $(T \times 1)$ vector containing all the observations on the
first endogenous variable in the model, and $x_1$ is a
$(T \times (G_1 + M_1 - 1))$ matrix containing all observations on the $M_1$
exogenous variables included in the first equation and on the $G_1 - 1$
contemporaneous endogenous variables included in the first equation.
Given that the matrix $x_1$ contains some endogenous variables, in general we
have:

$$\mathrm{plim}\, \frac{1}{T} x_1'u_1 \neq 0 \qquad (4.10)$$

and the OLS estimator of the parameters of interest is not consistent.
Condition (4.10) is immediately understood by referring to the reduced form of the
system (4.7)

$$y_t = -B^{-1}\Gamma z_t + B^{-1}u_t \qquad (4.11)$$

$$u_t \sim n.i.d.\,(0, \Sigma)$$

which shows that, with the exception of special configurations for the matrices
B and $\Sigma$, all endogenous variables are correlated with all the residuals,
including those in $u_1$.

4.2.3 GIVE estimators 

The Generalized Instrumental Variables (GIVE) estimator is derived by considering
that, in the simultaneous model, condition (4.10) holds but we also have

$$\mathrm{plim}\, \frac{1}{T} z'u_1 = 0 \qquad (4.12)$$

therefore a consistent estimator of the parameters of interest can be derived by
solving the following system of equations:

$$z'u_1 = 0 \qquad (4.13)$$

$$z'(y_1 - x_1\delta_1) = 0$$

System (4.13) contains a number of equations equal to the number of variables
in z, M, while the number of unknowns is equal to the number of parameters in the
vector $\delta_1$, $K_1 = G_1 + M_1 - 1$. We then have three cases of interest:

(i) $M < K_1$: the number of unknowns is larger than the number of equations
and no estimator of the parameters of interest can be derived. This is not
surprising: in this case the equation is not identified.

(ii) $M = K_1$: the number of unknowns is exactly equal to the number of
equations, the equation is just identified, and (4.13) has a unique solution
which delivers an estimator of the parameters of interest:

$$\hat{\delta}_1 = (z'x_1)^{-1} z'y_1$$

(iii) $M > K_1$: the number of equations is larger than the number of unknowns,
the equation is over-identified and the estimator of the parameters of interest
is not uniquely determined by the orthogonality condition (4.13).

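An intuitive solution for the over-identification case is obtained by taking
$K_1$ linear combinations of the M orthogonality conditions. Define a matrix L of
dimensions $(K_1 \times M)$. Pre-multiplying the system (4.13) by L, we have:

$$Lz'u_1 = 0 \qquad (4.14)$$

$$Lz'(y_1 - x_1\delta_1) = 0$$

From (4.14) we derive the following estimator:

$$\hat{\delta}_1 = (Lz'x_1)^{-1} Lz'y_1 \qquad (4.15)$$

From (4.15) it follows that

$$\hat{\delta}_1 - \delta_1 = (Lz'x_1)^{-1} Lz'u_1 \qquad (4.16)$$

and

$$\sqrt{T}\left(\hat{\delta}_1 - \delta_1\right) = \left(\frac{1}{T} Lz'x_1\right)^{-1} L \frac{1}{\sqrt{T}} z'u_1 \qquad (4.17)$$

Given that consistency of the estimator is guaranteed by the hypothesis
(4.12), assuming that

$$\mathrm{plim}\, \frac{1}{T} Lz'x_1 = LM_{zx_1}, \qquad \frac{1}{\sqrt{T}} z'u_1 \xrightarrow{d} N(0, \sigma_{11}M_{zz})$$

we can apply Cramér's theorem to conclude that:

$$\sqrt{T}\left(\hat{\delta}_1 - \delta_1\right) \sim N\left(0, \sigma_{11}\left(LM_{zx_1}\right)^{-1} LM_{zz}L' \left(M_{x_1z}L'\right)^{-1}\right) \qquad (4.18)$$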

(4.18) characterizes completely the properties of the estimator; however,
empirical implementation requires knowledge of the matrix L. Note that the
variance of the estimator depends on L, hence a natural criterion for choosing
this matrix is the maximization of the efficiency of the estimator. Sargan ([12])
shows that the variance of the estimator is minimized when
$L = M_{x_1z}M_{zz}^{-1}$, in which case we have

$$\sqrt{T}\left(\hat{\delta}_1 - \delta_1\right) \sim N\left(0, \sigma_{11}\left(M_{x_1z}M_{zz}^{-1}M_{zx_1}\right)^{-1}\right) \qquad (4.19)$$

The choice of L defines the following estimator:

$$\hat{\delta}_1 = \left(x_1'z(z'z)^{-1}z'x_1\right)^{-1} x_1'z(z'z)^{-1}z'y_1 \qquad (4.20)$$

whose variance-covariance matrix can be estimated as follows:

$$s_1^2 \left(x_1'z(z'z)^{-1}z'x_1\right)^{-1}, \qquad s_1^2 = \frac{1}{T}\left(y_1 - x_1\hat{\delta}_1\right)'\left(y_1 - x_1\hat{\delta}_1\right)$$

(4.20) defines the Generalized Instrumental Variables Estimator (GIVE).
It is easy to see that in the case of exact identification GIVE simplifies to
$(z'x_1)^{-1}z'y_1$.

An equivalent estimator to GIVE is derived by the following two-step procedure:

• regress $x_1$ on z by OLS and construct the fitted values
$\hat{x}_1 = z(z'z)^{-1}z'x_1 = z\hat{\Pi}_1$

• regress $y_1$ on $\hat{x}_1$ by OLS to obtain

$$\hat{\delta}_1 = (\hat{x}_1'\hat{x}_1)^{-1}\hat{x}_1'y_1 = \left(x_1'z(z'z)^{-1}z'x_1\right)^{-1} x_1'z(z'z)^{-1}z'y_1$$

which is known as the Two-Stage Least Squares (TSLS) estimator. Note
that, in order to obtain a TSLS estimator as efficient as the GIVE estimator,
it is important to avoid generating an estimate of the variance of the estimator
using the residuals from the second stage. In fact we have:

$$\begin{aligned}
\hat{u}_{1,TSLS} &= y_1 - \hat{x}_1\hat{\delta}_1 \\
&= y_1 - x_1\hat{\delta}_1 - (\hat{x}_1 - x_1)\hat{\delta}_1 \\
&= \hat{u}_{1,GIVE} - (\hat{x}_1 - x_1)\hat{\delta}_1
\end{aligned}$$

which would result in an upward biased estimator of the variance. The problem
is easily solved, and in virtually all available econometric packages there is
no difference between the TSLS and the GIVE estimators.
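The equivalence between GIVE and the two-step procedure, and the difference between the two sets of residuals, can be seen in a small simulation. The sketch below is illustrative only: the function name `give`, the data generating process and all parameter values are assumptions made here. One regressor is endogenous because its innovation `v` is correlated with the equation residual `u1`, and two valid instruments are available.

```python
import numpy as np

def give(x1, z, y1):
    """GIVE estimate, its estimated covariance and the first-stage fitted values."""
    x1_hat = z @ np.linalg.solve(z.T @ z, z.T @ x1)         # z (z'z)^{-1} z' x1
    delta = np.linalg.solve(x1_hat.T @ x1, x1_hat.T @ y1)   # (x1'P x1)^{-1} x1'P y1
    resid = y1 - x1 @ delta              # GIVE residuals use x1, not x1_hat
    s2 = resid @ resid / len(y1)
    return delta, s2 * np.linalg.inv(x1_hat.T @ x1), x1_hat

# Hypothetical data generating process.
rng = np.random.default_rng(0)
T = 5000
z = rng.normal(size=(T, 2))
v = rng.normal(size=T)
u1 = 0.8 * v + 0.6 * rng.normal(size=T)       # residual correlated with v
x1 = (z @ np.array([1.0, 1.0]) + v)[:, None]  # endogenous regressor
delta_true = 1.5
y1 = delta_true * x1[:, 0] + u1

delta_give, cov_give, x1_hat = give(x1, z, y1)
delta_ols = np.linalg.lstsq(x1, y1, rcond=None)[0]
delta_two_step = np.linalg.lstsq(x1_hat, y1, rcond=None)[0]

# Residual variances: GIVE residuals versus second-stage residuals.
s2_give = np.mean((y1 - x1 @ delta_give) ** 2)
s2_second_stage = np.mean((y1 - x1_hat @ delta_two_step) ** 2)

print(delta_ols, delta_give, delta_two_step)
print(s2_give, s2_second_stage)
```

In this design OLS overstates the coefficient, GIVE and the two-step estimator coincide, and the variance computed from second-stage residuals is much larger than the one computed from GIVE residuals.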
More interesting problems are generated by mis-specification. In general we
have mis-specification when the instruments are correlated with the residuals. The
classical form of mis-specification is omitted variables. Consider the following
case.

The Data Generating Process can be represented as follows:

$$y_1 = x_1\delta_1 + x_2\delta_2 + u_1$$

The following model is estimated:

$$y_1 = x_1\delta_1 + v_1$$

The GIVE-TSLS estimator of $\delta_1$ is:

$$\begin{aligned}
\hat{\delta}_1 &= \left(x_1'z(z'z)^{-1}z'x_1\right)^{-1} x_1'z(z'z)^{-1}z'y_1 \\
&= \delta_1 + \left(x_1'z(z'z)^{-1}z'x_1\right)^{-1} x_1'z(z'z)^{-1}z'(x_2\delta_2 + u_1)
\end{aligned}$$

which is not consistent whenever $x_2$ is correlated with z. In this case we have

$$\mathrm{plim}\, \frac{1}{T} z'v_1 \neq 0$$

and the instruments in z cannot be considered as valid.

Sargan ([12]) derived a statistic to test the null hypothesis of validity of
instruments by showing that the quantity:

$$\frac{\hat{u}_1'z(z'z)^{-1}z'\hat{u}_1}{s_1^2}, \qquad s_1^2 = \frac{1}{T}\left(y_1 - x_1\hat{\delta}_1\right)'\left(y_1 - x_1\hat{\delta}_1\right)$$

is distributed as a $\chi^2$ with $M - K_1$ degrees of freedom under the null
hypothesis of validity of instruments.
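A minimal sketch of the statistic follows, under assumptions made here for illustration: the function names and the simulated data are hypothetical, and the second instrument in `z_bad` is deliberately built to be correlated with the residual, so that the null hypothesis is false.

```python
import numpy as np

def give(x1, z, y1):
    """GIVE / TSLS coefficient estimate."""
    x1_hat = z @ np.linalg.solve(z.T @ z, z.T @ x1)
    return np.linalg.solve(x1_hat.T @ x1, x1_hat.T @ y1)

def sargan(x1, z, y1):
    """Sargan statistic and its degrees of freedom, M - K_1."""
    u_hat = y1 - x1 @ give(x1, z, y1)
    s2 = u_hat @ u_hat / len(y1)
    zu = z.T @ u_hat
    stat = zu @ np.linalg.solve(z.T @ z, zu) / s2
    return stat, z.shape[1] - x1.shape[1]

# Hypothetical data: one endogenous regressor, one valid instrument z1.
rng = np.random.default_rng(1)
T = 2000
z1 = rng.normal(size=T)
v = rng.normal(size=T)
u1 = 0.8 * v + 0.6 * rng.normal(size=T)
x1 = (z1 + v)[:, None]
y1 = 1.5 * x1[:, 0] + u1

# An invalid second instrument: it is correlated with the residual u1.
z_bad = np.column_stack([z1, u1 + rng.normal(size=T)])

stat_bad, dof_bad = sargan(x1, z_bad, y1)          # over-identified, invalid instrument
stat_just, dof_just = sargan(x1, z1[:, None], y1)  # exactly identified

print(stat_bad, dof_bad)
print(stat_just, dof_just)
```

With one degree of freedom the 5 per cent critical value of the χ² distribution is 3.84, and the statistic computed with the invalid instrument lies far above it. In the exactly identified case the statistic is zero by construction and there are no degrees of freedom left, so the validity of the instruments cannot be tested.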

4.2.4 Three-stage least squares (3SLS) and Seemingly Unrelated Regressions 
(SURE) estimators 

The estimators we have considered so far solve the problems generated by
simultaneity without reverting to the specification of the full structural model.
For this reason GIVE-TSLS estimators are known as limited information estimators.
To analyze full-information estimators we need to introduce some new definitions.

For any two matrices A $(m \times n)$ and B $(p \times q)$ define as the Kronecker
product $A \otimes B$ the $(mp \times nq)$ matrix obtained by multiplying each
element of A by B. The following properties are related to the Kronecker product:

• $(A \otimes B)(C \otimes D) = AC \otimes BD$, whenever the matrices AC and BD
are defined

• $(A \otimes B)' = A' \otimes B'$

• $(A \otimes B)^{-1} = A^{-1} \otimes B^{-1}$, whenever the matrices $A^{-1}$
and $B^{-1}$ are defined

Define vec(A), the vectorization of A, as the $(mn \times 1)$ vector obtained
by stacking the m transposed rows of A:

$$\underset{m \times n}{A} = \begin{pmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & & \vdots \\ a_{m1} & \cdots & a_{mn} \end{pmatrix}, \qquad
\mathrm{vec}(A) = \begin{pmatrix} a_{11} \\ \vdots \\ a_{1n} \\ a_{21} \\ \vdots \\ a_{2n} \\ \vdots \\ a_{mn} \end{pmatrix}$$

Vectorization and Kronecker product are linked by the following property:

$$\mathrm{vec}(ABC) = (A \otimes C')\,\mathrm{vec}(B)$$
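These properties are easy to verify numerically. The sketch below uses NumPy's `kron`; since vec is defined here by stacking rows, it corresponds to a row-major reshape, and the link with the Kronecker product is checked in the form vec(ABC) = (A ⊗ C′) vec(B). The matrices are arbitrary illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(2)

def vec(m):
    """Row-stacking vectorization: the transposed rows of m stacked in one vector."""
    return m.reshape(-1)

# Link between vec and the Kronecker product.
A = rng.normal(size=(2, 3))
B = rng.normal(size=(3, 4))
C = rng.normal(size=(4, 5))
vec_rule_holds = np.allclose(vec(A @ B @ C), np.kron(A, C.T) @ vec(B))

# Product, transposition and inverse rules, on small non-singular matrices.
P = np.array([[2.0, 1.0],
              [0.0, 1.0]])
Q = np.array([[1.0, 0.0, 2.0],
              [0.0, 3.0, 1.0],
              [1.0, 0.0, 4.0]])
R = rng.normal(size=(2, 2))
S = rng.normal(size=(3, 3))

product_rule_holds = np.allclose(np.kron(P, Q) @ np.kron(R, S), np.kron(P @ R, Q @ S))
transpose_rule_holds = np.allclose(np.kron(P, Q).T, np.kron(P.T, Q.T))
inverse_rule_holds = np.allclose(np.linalg.inv(np.kron(P, Q)),
                                 np.kron(np.linalg.inv(P), np.linalg.inv(Q)))

print(vec_rule_holds, product_rule_holds, transpose_rule_holds, inverse_rule_holds)
```

Under the more common column-stacking convention the same link reads vec(ABC) = (C′ ⊗ A) vec(B), so the order of the factors depends on how vec is defined.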

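To discuss full-information estimation, consider that the i-th equation of our
model can be represented as

$$y_i = x_i\delta_i + u_i$$

where $y_i$ is a $(T \times 1)$ vector containing T observations on the i-th
endogenous variable, and $x_i$ is a $(T \times K_i)$ matrix, with
$K_i = G_i + M_i - 1$, containing all observations on the $G_i - 1$ endogenous and
on the $M_i$ exogenous variables included in the i-th equation. We can give the
following compact representation of the model:

$$y^+ = x^+\delta^+ + u^+ \qquad (4.21)$$

where $y^+$ is a $(GT \times 1)$ vector, $x^+$ is a
$(GT \times \sum_{i=1}^{G} K_i)$ matrix, $\delta^+$ is a
$(\sum_{i=1}^{G} K_i \times 1)$ vector and $u^+$ is a $(GT \times 1)$ vector:

$$y^+ = \begin{pmatrix} y_1 \\ \vdots \\ y_G \end{pmatrix}, \quad
x^+ = \begin{pmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & x_G \end{pmatrix}, \quad
\delta^+ = \begin{pmatrix} \delta_1 \\ \vdots \\ \delta_G \end{pmatrix}, \quad
u^+ = \begin{pmatrix} u_1 \\ \vdots \\ u_G \end{pmatrix}$$

The following properties hold for $u^+$:

$$E(u^+) = 0$$

$$E(u^+u^{+\prime}) = \begin{pmatrix}
E(u_1u_1') & E(u_1u_2') & \cdots & E(u_1u_G') \\
E(u_2u_1') & E(u_2u_2') & & \vdots \\
\vdots & & \ddots & \\
E(u_Gu_1') & \cdots & & E(u_Gu_G')
\end{pmatrix}$$

where each block of the above matrix is $(T \times T)$.

Assuming that all residuals are contemporaneously correlated but not serially
correlated, with non-singular variance-covariance matrix $\Sigma$,¹ we have:

$$E(u^+u^{+\prime}) = \Sigma \otimes I_T = \begin{pmatrix}
\sigma_{11}I_T & \sigma_{12}I_T & \cdots & \sigma_{1G}I_T \\
\sigma_{21}I_T & \sigma_{22}I_T & & \vdots \\
\vdots & & \ddots & \\
\sigma_{G1}I_T & \cdots & & \sigma_{GG}I_T
\end{pmatrix}$$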

The problem to be solved is the estimation of the parameters in (4.21) taking
into account simultaneity and the structure of correlations in $\Sigma$. These
problems are solved in turn by the Three-Stage Least Squares (3SLS) estimator.

¹ The last assumption requires that all identities are excluded from the model.

4.2.4.1 First stage: the diagonalization of $\Sigma$

Consider the following decomposition for $\Sigma^{-1}$

$$\Sigma^{-1} = HH' \qquad (4.22)$$

which always exists. From (4.22) we have

$$H'\Sigma H = I_G$$

By pre-multiplying (4.21) by $H' \otimes I_T$, we obtain

$$(H' \otimes I_T)y^+ = (H' \otimes I_T)x^+\delta^+ + (H' \otimes I_T)u^+ \qquad (4.23)$$

where the residuals of (4.23) feature a diagonal variance-covariance matrix:

$$\begin{aligned}
E\left((H' \otimes I_T)u^+u^{+\prime}(H' \otimes I_T)'\right) &= (H' \otimes I_T)(\Sigma \otimes I_T)(H' \otimes I_T)' \\
&= (H'\Sigma \otimes I_T)(H \otimes I_T) \\
&= I_G \otimes I_T = I_{GT}
\end{aligned}$$

The completion of the first stage has left us with the following transformed
model:

$$(H' \otimes I_T)y^+ = (H' \otimes I_T)x^+\delta^+ + (H' \otimes I_T)u^+ \qquad (4.24)$$

$$y^* = x^*\delta^+ + u^* \qquad (4.25)$$

$$E(u^*) = 0, \qquad E(u^*u^{*\prime}) = I_{GT} \qquad (4.26)$$

in which the variance-covariance matrix is diagonal, but we still have
simultaneity; in fact

$$\mathrm{plim}\, \frac{1}{T} x^{*\prime}u^* \neq 0$$

4.2.4.2 The second stage: choice of instruments

To select instruments, remember that the reduced form of our original system can
be represented as follows:

$$y' = -B^{-1}\Gamma z' + B^{-1}u' \qquad (4.27)$$

Vectorisation of (4.27) delivers:

$$\mathrm{vec}(y') = -\mathrm{vec}(B^{-1}\Gamma z') + \mathrm{vec}(B^{-1}u') \qquad (4.28)$$

from which it follows that

$$y^+ = -\mathrm{vec}(I_G B^{-1}\Gamma z') + \mathrm{vec}(B^{-1}u') \qquad (4.29)$$

$$\phantom{y^+} = -(I_G \otimes z)\,\mathrm{vec}(B^{-1}\Gamma) + \mathrm{vec}(B^{-1}u') \qquad (4.30)$$

Then, the natural choice of instruments is $(I_G \otimes z)$.
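4.2.4.3 The third stage: applying the GIVE principle

By applying the GIVE principle to (4.25), choosing $z^* = (I_G \otimes z)$ as
instruments, we have

$$\hat{\delta}^+ = \left(x^{*\prime}z^*(z^{*\prime}z^*)^{-1}z^{*\prime}x^*\right)^{-1} x^{*\prime}z^*(z^{*\prime}z^*)^{-1}z^{*\prime}y^*$$

At this stage, remembering that

$$x^{*\prime}z^* = x^{+\prime}(H \otimes I_T)(I_G \otimes z) = x^{+\prime}(H \otimes z)$$

$$z^{*\prime}z^* = (I_G \otimes z')(I_G \otimes z) = I_G \otimes z'z$$

$$z^{*\prime}y^* = (I_G \otimes z')(H' \otimes I_T)y^+ = (H' \otimes z')y^+$$

we can show the following results:

$$\begin{aligned}
x^{*\prime}z^*(z^{*\prime}z^*)^{-1}z^{*\prime}x^* &= x^{+\prime}(H \otimes z)(I_G \otimes z'z)^{-1}(H' \otimes z')x^+ \\
&= x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)x^+
\end{aligned}$$

$$\begin{aligned}
x^{*\prime}z^*(z^{*\prime}z^*)^{-1}z^{*\prime}y^* &= x^{+\prime}(H \otimes z)(I_G \otimes z'z)^{-1}(H' \otimes z')y^+ \\
&= x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)y^+
\end{aligned}$$

and, finally, we have an expression for the 3SLS estimator

$$\hat{\delta}^+ = \left(x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)x^+\right)^{-1} x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)y^+$$

The asymptotic distribution of the 3SLS estimator can be written as follows:

$$\hat{\delta}^+ \overset{a}{\sim} N\left(\delta^+, \left(x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)x^+\right)^{-1}\right)$$

To make the estimator operational an estimate of $\Sigma$ is needed. This can be
obtained by using the sample variances and covariances of the residuals from 2SLS
estimation.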
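The whole procedure can be sketched in a few lines. The code below is an illustration under stated assumptions rather than a production implementation: the helper names, the two-equation system and its parameter values are hypothetical, and the (GT × GT) weighting matrix is formed explicitly, which is only practical for small samples.

```python
import numpy as np

def proj(z):
    """Projection matrix z (z'z)^{-1} z'."""
    return z @ np.linalg.solve(z.T @ z, z.T)

def two_sls(x, z, y):
    x_hat = proj(z) @ x
    return np.linalg.solve(x_hat.T @ x, x_hat.T @ y)

def block_diag(blocks):
    """Block-diagonal matrix x+ built from the equation-specific regressor matrices."""
    out = np.zeros((sum(b.shape[0] for b in blocks), sum(b.shape[1] for b in blocks)))
    r = c = 0
    for b in blocks:
        out[r:r + b.shape[0], c:c + b.shape[1]] = b
        r, c = r + b.shape[0], c + b.shape[1]
    return out

def three_sls(xs, z, ys, sigma=None):
    """3SLS; if sigma is not supplied it is estimated from 2SLS residuals."""
    T = z.shape[0]
    if sigma is None:
        resid = np.column_stack([y - x @ two_sls(x, z, y) for x, y in zip(xs, ys)])
        sigma = resid.T @ resid / T
    x_plus, y_plus = block_diag(xs), np.concatenate(ys)
    weight = np.kron(np.linalg.inv(sigma), proj(z))
    return np.linalg.solve(x_plus.T @ weight @ x_plus, x_plus.T @ weight @ y_plus)

# Hypothetical system: y1 = 0.5*y2 + z1 + u1 ; y2 = -0.3*y1 + z2 + z3 + u2
rng = np.random.default_rng(3)
T = 300
z = rng.normal(size=(T, 3))
sigma_true = np.array([[1.0, 0.6], [0.6, 1.0]])
u = rng.normal(size=(T, 2)) @ np.linalg.cholesky(sigma_true).T
B_struct = np.array([[1.0, -0.5], [0.3, 1.0]])
rhs = np.column_stack([z[:, 0] + u[:, 0], z[:, 1] + z[:, 2] + u[:, 1]])
y = np.linalg.solve(B_struct, rhs.T).T

xs = [np.column_stack([y[:, 1], z[:, 0]]), np.column_stack([y[:, 0], z[:, 1], z[:, 2]])]
ys = [y[:, 0], y[:, 1]]

delta_3sls = three_sls(xs, z, ys)
delta_2sls = np.concatenate([two_sls(x, z, yi) for x, yi in zip(xs, ys)])
delta_diag = three_sls(xs, z, ys, sigma=np.eye(2))   # diagonal Sigma imposed
print(delta_3sls, delta_2sls)
```

As a sanity check, when a diagonal covariance matrix is imposed the weighting matrix becomes block diagonal and the function returns the equation-by-equation 2SLS estimates.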

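To analyze the estimator more closely we can re-write it in a more extensive
format. In fact, we have

$$x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)x^+ =
\begin{pmatrix} x_1' & 0 & \cdots & 0 \\ 0 & x_2' & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & x_G' \end{pmatrix}
\begin{pmatrix} \sigma^{11}z(z'z)^{-1}z' & \cdots & \sigma^{1G}z(z'z)^{-1}z' \\ \vdots & & \vdots \\ \sigma^{G1}z(z'z)^{-1}z' & \cdots & \sigma^{GG}z(z'z)^{-1}z' \end{pmatrix}
\begin{pmatrix} x_1 & 0 & \cdots & 0 \\ 0 & x_2 & & \vdots \\ \vdots & & \ddots & 0 \\ 0 & \cdots & 0 & x_G \end{pmatrix}$$

$$x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)y^+ =
\begin{pmatrix} \sum_j \sigma^{1j}x_1'z(z'z)^{-1}z'y_j \\ \vdots \\ \sum_j \sigma^{Gj}x_G'z(z'z)^{-1}z'y_j \end{pmatrix}$$

where $\sigma^{ij}$ represents the generic element i,j of the matrix
$\Sigma^{-1}$. We are now in a position to consider some specific cases of the
3SLS estimator.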

Note first that the 3SLS estimator coincides with the 2SLS estimator when the
matrix $\Sigma$ is diagonal. In this case we have:

$$\hat{\delta}^+ = \left(x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)x^+\right)^{-1} x^{+\prime}\left(\Sigma^{-1} \otimes z(z'z)^{-1}z'\right)y^+
= \begin{pmatrix} \left(x_1'z(z'z)^{-1}z'x_1\right)^{-1} x_1'z(z'z)^{-1}z'y_1 \\ \vdots \\ \left(x_G'z(z'z)^{-1}z'x_G\right)^{-1} x_G'z(z'z)^{-1}z'y_G \end{pmatrix}$$

This equivalence result also holds when all the equations in the system are
exactly identified.

Another interesting case arises when the matrix B in our structural model
(4.7) is diagonal, when we have:

$$y^+ = x^+\delta^+ + u^+ \qquad (4.31)$$

$$E(u^+) = 0 \qquad (4.32)$$

$$E(u^+u^{+\prime}) = \Sigma \otimes I_T \qquad (4.33)$$

$$\mathrm{plim}\, \frac{1}{T} x^{+\prime}u^+ = 0 \qquad (4.34)$$

The particular structure of B implies that all the simultaneity in the system
comes from the correlation of residuals; therefore, after the implementation of
the first stage of 3SLS, the diagonalization of the variance-covariance matrix, a
consistent estimator is derived by applying OLS to the transformed model. The
relevant estimator is then:

$$\hat{\delta}^+ = (x^{*\prime}x^*)^{-1}x^{*\prime}y^* = \left(x^{+\prime}(\Sigma^{-1} \otimes I_T)x^+\right)^{-1} x^{+\prime}(\Sigma^{-1} \otimes I_T)y^+$$

which is known as the Seemingly Unrelated Regression Equations (SURE)
or Zellner's estimator.

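A further interesting specific case of the SURE estimator is obtained when
each equation of the system contains the same set of regressors, so that
$x^+ = I_G \otimes x$:

$$\begin{aligned}
\hat{\delta}^+ &= \left((I_G \otimes x)'(\Sigma^{-1} \otimes I_T)(I_G \otimes x)\right)^{-1}(I_G \otimes x)'(\Sigma^{-1} \otimes I_T)y^+ \\
&= \left(\Sigma^{-1} \otimes x'x\right)^{-1}\left(\Sigma^{-1} \otimes x'\right)y^+ \\
&= \left(\Sigma \otimes (x'x)^{-1}\right)\left(\Sigma^{-1} \otimes x'\right)y^+ \\
&= \left(I_G \otimes (x'x)^{-1}x'\right)y^+
\end{aligned}$$

which gives a compact representation of the OLS estimators applied equation
by equation.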
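This last result is easy to confirm numerically. In the sketch below the covariance matrix `sigma`, the coefficients and the sample size are hypothetical; the SURE estimator is computed from the stacked system with the (GT × GT) weighting matrix formed explicitly, and compared with OLS applied equation by equation.

```python
import numpy as np

rng = np.random.default_rng(4)
T, G = 100, 3

# The same regressors appear in every equation.
x = np.column_stack([np.ones(T), rng.normal(size=(T, 2))])

# Hypothetical residual covariance matrix with non-zero off-diagonal elements.
sigma = np.array([[1.0, 0.5, 0.2],
                  [0.5, 2.0, 0.3],
                  [0.2, 0.3, 1.5]])
coef = rng.normal(size=(3, G))                       # column g holds delta_g
y = x @ coef + rng.normal(size=(T, G)) @ np.linalg.cholesky(sigma).T

# Stacked system: x+ = I_G (x) x, y+ stacks the G equations one after the other.
x_plus = np.kron(np.eye(G), x)
y_plus = y.T.reshape(-1)

weight = np.kron(np.linalg.inv(sigma), np.eye(T))    # Sigma^{-1} (x) I_T
delta_sure = np.linalg.solve(x_plus.T @ weight @ x_plus, x_plus.T @ weight @ y_plus)

# OLS equation by equation, stacked in the same order.
delta_ols = np.linalg.lstsq(x, y, rcond=None)[0].T.reshape(-1)

print(np.max(np.abs(delta_sure - delta_ols)))
```

The two vectors coincide up to rounding error, whatever the value of the off-diagonal elements of `sigma`.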
4.2.5 FIML estimator

Lastly, we give a brief description of the most general full-information
estimator: the Full Information Maximum Likelihood (FIML) estimator. Considering
the reduced form of our model (4.11) and taking logarithms, we can write the
joint distribution of $y_1, \ldots, y_T$ as follows:

$$
\log L = -\frac{GT}{2} \log (2\pi) - \frac{T}{2} \log \left| B^{-1} \Sigma B'^{-1} \right|
- \frac{1}{2} \sum_{t=1}^{T} \left( y_t - B^{-1}\Gamma x_t \right)' B' \Sigma^{-1} B
  \left( y_t - B^{-1}\Gamma x_t \right) \qquad (4.35)
$$

Note now that

$$
\left( y_t - B^{-1}\Gamma x_t \right)' B' = \left( B y_t - \Gamma x_t \right)'
$$

and that, from a standard result on determinants, it follows that:

$$
-\frac{T}{2} \log \left| B^{-1} \Sigma B'^{-1} \right| = T \log |B| - \frac{T}{2} \log |\Sigma|
$$

So we can rewrite our log-likelihood function as:

$$
\log L = -\frac{GT}{2} \log (2\pi) + T \log |B| - \frac{T}{2} \log |\Sigma|
- \frac{1}{2} \sum_{t=1}^{T} \left( B y_t - \Gamma x_t \right)' \Sigma^{-1}
  \left( B y_t - \Gamma x_t \right) \qquad (4.36)
$$

FIML estimators are derived by maximizing (4.36) with respect to B, $\Gamma$
and $\Sigma$. A number of technical issues arise as the problem is non-linear.
For a good discussion of these problems, and solutions, see Hendry ([7]). The
FIML estimator is the most general system estimator in that all other estimators
can be derived as its special cases; for a detailed derivation see Hendry ([6]).
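
As a rough check on the algebra leading from (4.35) to (4.36), the two
expressions can be evaluated on simulated data. The sketch below is only an
illustration in Python with NumPy: the function names, the parameter values and
the data are assumptions made for the example, and no maximization is attempted.
The determinant of B enters through the logarithm of its absolute value.

```python
import numpy as np

def fiml_loglik(B, Gamma, Sigma, y, x):
    """Log-likelihood (4.36) for B y_t = Gamma x_t + u_t; y is T x G, x is T x K."""
    T, G = y.shape
    resid = y @ B.T - x @ Gamma.T                 # rows are (B y_t - Gamma x_t)'
    _, logabsdet_B = np.linalg.slogdet(B)
    _, logdet_Sigma = np.linalg.slogdet(Sigma)
    quad = np.einsum("ti,ij,tj->", resid, np.linalg.inv(Sigma), resid)
    return (-G * T / 2 * np.log(2 * np.pi) + T * logabsdet_B
            - T / 2 * logdet_Sigma - 0.5 * quad)

def reduced_form_loglik(B, Gamma, Sigma, y, x):
    """Log-likelihood (4.35), written on the reduced form y_t = B^{-1} Gamma x_t + v_t."""
    T, G = y.shape
    B_inv = np.linalg.inv(B)
    Pi = B_inv @ Gamma                            # reduced-form coefficients
    Omega = B_inv @ Sigma @ B_inv.T               # reduced-form covariance
    v = y - x @ Pi.T
    _, logdet_Omega = np.linalg.slogdet(Omega)
    quad = np.einsum("ti,ij,tj->", v, np.linalg.inv(Omega), v)
    return -G * T / 2 * np.log(2 * np.pi) - T / 2 * logdet_Omega - 0.5 * quad

# Illustrative two-equation system with three exogenous variables
rng = np.random.default_rng(0)
T, G, K = 50, 2, 3
B = np.array([[1.0, -0.4], [0.3, 1.0]])
Gamma = rng.normal(size=(G, K))
Sigma = np.array([[1.0, 0.3], [0.3, 0.5]])
x = rng.normal(size=(T, K))
u = rng.multivariate_normal(np.zeros(G), Sigma, size=T)
y = (x @ Gamma.T + u) @ np.linalg.inv(B).T        # y_t = B^{-1} (Gamma x_t + u_t)

ll_structural = fiml_loglik(B, Gamma, Sigma, y, x)
ll_reduced = reduced_form_loglik(B, Gamma, Sigma, y, x)
print(ll_structural, ll_reduced)
```

The two printed values should agree up to rounding error for any admissible
parameter values, not only the ones that generated the data.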

4.3 Simulation 

Having identified the model and estimated the parameters of interest, it is
possible to proceed to the simulation. For given values of the parameters and of
the exogenous variables, values for the endogenous variables are found by
computing the dynamic solution of the model. To illustrate how this result is
accomplished, consider the following general representation of a model including
n endogenous variables $y = (y_1, y_2, \ldots, y_n)'$ and k exogenous variables
$x = (x_1, x_2, \ldots, x_k)'$:

$$
\begin{aligned}
y_{1t} &= f_1 \left( y_{1t}, \ldots, y_{nt}, A_1(L) y_{t-1}, x_t, B_1(L) x_{t-1} \right) \\
y_{2t} &= f_2 \left( y_{1t}, \ldots, y_{nt}, A_2(L) y_{t-1}, x_t, B_2(L) x_{t-1} \right) \\
&\;\;\vdots \\
y_{kt} &= f_k \left( y_{1t}, \ldots, y_{nt}, A_k(L) y_{t-1}, x_t, B_k(L) x_{t-1} \right) \\
&\;\;\vdots \\
y_{nt} &= f_n \left( y_{1t}, \ldots, y_{nt}, A_n(L) y_{t-1}, x_t, B_n(L) x_{t-1} \right)
\end{aligned}
$$

In the specifications discussed so far the functions $f_i$ are linear, but more
general specifications could be accommodated within this framework.

Solving the model amounts to finding a fixed point such that:

$$
y_t = f \left( y_t, A(L) y_{t-1}, x_t, B(L) x_{t-1} \right).
$$

A popular numerical method, implemented in many widely available packages,
such as E-Views, is the Gauss-Seidel method. The Gauss-Seidel method finds
the fixed point by iteration using the updating rule:

$$
y_t^{l+1} = f \left( y_t^{l}, A(L) y_{t-1}, x_t, B(L) x_{t-1} \right)
$$

Gauss-Seidel solves the equations in the order in which they appear in the
model. So if an endogenous variable that has already been solved for appears
later in some other equation, Gauss-Seidel uses the value as solved in that
iteration. To illustrate matters, the k-th variable in the l-th iteration is
solved by:

$$
y_{kt}^{l} = f_k \left( y_{1t}^{l}, \ldots, y_{k-1,t}^{l}, y_{kt}^{l-1}, y_{k+1,t}^{l-1},
\ldots, y_{nt}^{l-1}, A_k(L) y_{t-1}, x_t, B_k(L) x_{t-1} \right)
$$

As a consequence the ordering of the variables matters, and equations with
relatively few right-hand-side endogenous variables should be listed early in
the model. As the model is solved by setting disturbances to zero, we have a
deterministic solution; a stochastic solution can easily be generated by solving
the model after adding drawings from random variables and taking expected values
afterwards.
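
The iteration can be sketched in a few lines of Python. This is only an
illustration of the updating rule, not the E-Views solver: the function
`gauss_seidel`, the tolerance and the two-equation static model are hypothetical
choices made for the example.

```python
def gauss_seidel(equations, y0, tol=1e-10, max_iter=1000):
    """Solve y = f(y) by Gauss-Seidel.

    equations[k](y) returns the new value of y[k] given the current vector y.
    Equations are processed in the order listed, so each one already sees the
    values updated earlier in the same iteration.
    """
    y = list(y0)
    for iteration in range(1, max_iter + 1):
        max_change = 0.0
        for k, f_k in enumerate(equations):
            new_value = f_k(y)
            max_change = max(max_change, abs(new_value - y[k]))
            y[k] = new_value
        if max_change < tol:
            return y, iteration
    raise RuntimeError("Gauss-Seidel did not converge")

# Hypothetical static model (illustrative numbers):
#   c   = 10 + 0.6 * inc     consumption function
#   inc = c + 20             income identity with exogenous spending of 20
equations = [
    lambda y: 10 + 0.6 * y[1],
    lambda y: y[0] + 20,
]
solution, n_iter = gauss_seidel(equations, y0=[0.0, 0.0])
print(solution, n_iter)
```

For this model the fixed point can be found by hand (c = 55, inc = 75), which
gives a simple check on the iteration. In a dynamic model the same loop would be
run period by period, with lagged values taken from earlier solved periods.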
4.4 Policy Evaluation

Dynamic simulation can be used to evaluate the effect of different policies,
defined by specifying different patterns for the exogenous variables. Policy
evaluation is implemented by examining how the predicted values of the
endogenous variables change after one or more exogenous variables are modified.
Policy evaluation implies simulating the model twice. First a baseline (control)
simulation is run. Such a simulation can be run within sample, in which case
observed data are available for the exogenous variables, or outside the
available sample, in which case values are assigned to the exogenous variables.
In the case of out-of-sample simulation, which is equivalent to forecasting the
endogenous variables for a given scenario for the exogenous variables, it is
useful to assign values to the exogenous variables such that the baseline
simulation path exhibits standard historical patterns for the endogenous
variables. The results of the baseline simulation are then compared with those
obtained from an alternative (disturbed) simulation, based on the modification
of the relevant exogenous variables. Policy evaluation is usually based on
dynamic multipliers.

Consider the case of the simulation of a model over a sample of size T, and
index by t the generic observation in that sample. Denote by $x_t^b$ the series
of values attributed to the exogenous variable x in the baseline simulation, and
by $x_t^d = x_t^b + \delta$ the series of alternative values attributed to the
same variable in the disturbed simulation. Similarly, denote by $y_{nt}^b$ the
solved value for the endogenous variable $y_n$ at time t in the baseline
simulation and by $y_{nt}^d$ the solved value for the endogenous variable $y_n$
at time t in the disturbed simulation.

The dynamic multiplier is then defined as follows:

$$
\frac{y_{nt}^{d} - y_{nt}^{b}}{x_t^{d} - x_t^{b}} = \frac{y_{nt}^{d} - y_{nt}^{b}}{\delta}
\qquad (4.37)
$$


When models are stable, long-run multipliers, obtained for large t, converge to
fixed numbers. Note that in linear systems long-run multipliers can also be
obtained by giving a temporary (one-period) impulse to the exogenous variable
and by then computing the cumulative response of the endogenous variables.
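
The last remark can be illustrated with a small numerical experiment. The sketch
below is in Python and uses a hypothetical stable linear model,
y_t = rho * y_{t-1} + beta * x_t, with illustrative values for rho and beta. It
computes the long-run multiplier once from a permanent shock and once as the
cumulated response to a one-period impulse.

```python
def simulate(rho, beta, x_path, y_init=0.0):
    """Dynamic solution of y_t = rho * y_{t-1} + beta * x_t along x_path."""
    y, path = y_init, []
    for x in x_path:
        y = rho * y + beta * x
        path.append(y)
    return path

rho, beta, horizon, shock = 0.8, 0.5, 200, 1.0

baseline = simulate(rho, beta, [0.0] * horizon)
permanent = simulate(rho, beta, [shock] * horizon)                  # x raised for good
temporary = simulate(rho, beta, [shock] + [0.0] * (horizon - 1))    # one-period impulse

# Long-run multiplier from the permanent shock: response at the end of the sample
long_run_from_permanent = (permanent[-1] - baseline[-1]) / shock

# Long-run multiplier from the impulse: cumulated response over the sample
long_run_from_impulse = sum(d - b for d, b in zip(temporary, baseline)) / shock

print(long_run_from_permanent, long_run_from_impulse)
```

Both numbers approach beta / (1 - rho) = 2.5 in this example. The equality
relies on linearity and would not be expected to hold in a non-linear model.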

To illustrate matters, assume that the estimation over a given sample, say
1960:1-1998:1, of a simple dynamic model for consumption and income has
delivered the following results, similar to those obtained in the dynamic model
of US consumption discussed in Chapter 2:

$$
\begin{aligned}
\Delta c_t &= 0.25\, \Delta y_t - 0.15\, (c_{t-1} - y_{t-1}) \\
\Delta y_t &= 0.008
\end{aligned}
$$

We aim at deriving the dynamic multiplier describing the response of
consumption to a one per cent increase in income by simulating the model over
the period 1998:2-2020:4. The following E-Views programme (run after having
opened the file USUK.WF1) achieves the result:

' start from the long-run solution (consumption equal to income)
SMPL 1998:1 1998:1
LCUS=LYUS

' define the model
SMPL 1998:2 2020:4
model consinc
consinc.append LCUS =LCUS(-1)-0.15*LCUS(-1)+0.15*LYUS(-1) +0.250*(LYUS-LYUS(-1))
consinc.append LYUS =0.008 +LYUS(-1)

' baseline simulation
COPY CONSINC M_TEMP
M_TEMP.APPEND ASSIGN @ALL _BL
M_TEMP.SOLVE
delete M_TEMP

' disturbed simulation: shock LYUS in 1998:1
SMPL 1998:1 1998:1
LYUS=LCUS+0.01
SMPL 1998:2 2020:4
COPY CONSINC M_TEMP
M_TEMP.APPEND ASSIGN @ALL _DS
M_TEMP.SOLVE
delete M_TEMP

' dynamic multiplier
SMPL 1998:1 2020:4
genr DM=100*(LCUS_DS-LCUS_BL)
plot DM

SMPL 1998:1 1998:1
LYUS=LCUS-0.01

The programme begins by setting all variables at their long-run solution.

Then the relevant model is constructed by defining it as CONSINC and by
including the specification for the two equations. The model CONSINC is then
copied to a temporary model, which is solved dynamically for the sample
1998:2-2020:4, and the suffix _BL, for baseline, is attributed to the variables
generated by the solution. In the following step the disturbed solution is
generated by adding a one per cent shock for one period (1998:1) to LYUS. Note
that, as LYUS has a unit root, the one-period shock has a permanent effect. The
disturbed solution is then computed and the suffix _DS is attached to the
generated variables. Lastly, the dynamic multiplier is computed by applying
formula (4.37); we report it in Figure 4.1.
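
The same experiment can be sketched outside E-Views. The Python fragment below
is an illustration under stated assumptions: the starting log level of 8.0 is
hypothetical (the actual values come from USUK.WF1, which is not reproduced
here), and the function name `solve_model` is invented for the example. Because
the model is linear, the multiplier does not depend on the starting level.

```python
def solve_model(c0, y0, periods):
    """Dynamic solution of
        c_t = c_{t-1} - 0.15 * (c_{t-1} - y_{t-1}) + 0.25 * (y_t - y_{t-1})
        y_t = y_{t-1} + 0.008
    starting from the initial values (c0, y0)."""
    c, y = [c0], [y0]
    for _ in range(periods):
        y_new = y[-1] + 0.008
        c_new = c[-1] - 0.15 * (c[-1] - y[-1]) + 0.25 * (y_new - y[-1])
        y.append(y_new)
        c.append(c_new)
    return c, y

periods = 91      # 1998:2 to 2020:4 is 91 quarters
start = 8.0       # hypothetical initial log level, with consumption equal to income

c_bl, _ = solve_model(start, start, periods)            # baseline
c_ds, _ = solve_model(start, start + 0.01, periods)     # income 1% higher in 1998:1

# Dynamic multiplier as in the programme: 100 * (disturbed - baseline)
dm = [100 * (d - b) for d, b in zip(c_ds, c_bl)]
print(dm[:4], dm[-1])
```

In this sketch the multiplier starts at zero in 1998:1, rises by 0.15 in the
first simulated quarter and converges towards one, since the error-correction
term closes 15 per cent of the remaining gap each quarter.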

Fig. 4.1.

Having illustrated the basics with this simple case, we move on to discuss a
more articulated Cowles Commission model of the monetary transmission
mechanism, taking all steps from specification to simulation.

4.5 A model of the monetary transmission mechanism

4.5.1 Specification of the theoretical model

We consider the closed-economy IS-LM specification with autoregressive
expectations:

$$
\begin{aligned}
y_t &= c_{0,11} + y_t^{*} - a_{13} \left( R_t - \pi_t^{e} \right) + \epsilon_{1t} && (4.38) \\
\pi_t &= \pi_t^{e} + a_{21} \left( y_t - y_t^{*} \right) + \epsilon_{2t} && (4.39) \\
\pi_t^{e} &= \pi_{t-1} && (4.40) \\
m_t - p_t &= c_{0,31} + a_{31} y_t - a_{33} R_t + \epsilon_{3t} && (4.41) \\
m_t &= c_{0,41} + m_{t-1} && (4.42) \\
y_t^{*} &= c_{0,51} + c_{0,52}\, t + \epsilon_{4t} && (4.43)
\end{aligned}
$$


Note that money supply is not stochastic, as it is considered to be fully
controlled by the monetary authority. The econometrician's task is the
estimation of the unknown parameters in order to simulate the impact of
different paths for the exogenous variable controlled by the monetary authority.
The model uses four equations to determine four endogenous variables, $\pi_t$,
$\pi_t^{e}$, $R_t$ and $y_t$, for given values of the two exogenous variables
$y_t^{*}$ and $m_t$. The exogeneity status is attributed to $y_t^{*}$ and $m_t$
either because they describe the available technology and demography or because
they are fully controlled by the policy-maker. Note that, under the hypothesis
of dynamic stability of the estimated model, the estimated values for the
parameters will only determine the short-run dynamics of output and inflation,
as the long-run equilibrium solutions are determined almost independently
(totally independently in the case $a_{31} = 1$) of the estimated parameters
($\pi = \Delta m - a_{31} \Delta y^{*}$, $y = y^{*}$).
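
One way to see where this long-run solution comes from is to impose a steady
state on (4.38)-(4.41) with the disturbances set to zero. The sketch below
assumes that $\pi_t = \Delta p_t$ and that the nominal interest rate is constant
in the long run; neither assumption is spelled out in the text.

$$
\begin{aligned}
&\text{constant inflation in (4.40): } \pi^{e} = \pi
  \;\Rightarrow\; \text{(4.39) gives } y = y^{*} \\
&\text{first difference of (4.41): } \Delta m - \pi = a_{31}\, \Delta y - a_{33}\, \Delta R \\
&\Delta R = 0,\;\; \Delta y = \Delta y^{*}
  \;\Rightarrow\; \pi = \Delta m - a_{31}\, \Delta y^{*}
\end{aligned}
$$

With $a_{31} = 1$ the long-run inflation rate is simply money growth minus the
growth of potential output.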

4.5.2 Estimation of the parameters of interest 

We consider a monthly data-set for the US economy (which we take as a closed
economy) to construct, estimate and simulate a version of the macroeconomic
model. The data-set, available in EXCEL format as LSZUSA.XLS, contains the
following variables for the sample 1959:7-1996:3 (for a complete description of
the data-set see Leeper-Sims-Zha (1997)):


CPISA: Consumer price index adjusted for seasonality

M1SA: M1 stock adjusted for seasonality

M2SA: M2 stock adjusted for seasonality

PCM: IMF index of commodity prices in dollars

RGDP: real US GDP at quarterly frequency

RGDPMON: real US GDP at monthly frequency (quarterly data interpolated by the
Chow-Lin procedure)

TBILL3: annually compounded nominal return on three-month TBills

TBOND10: annually compounded redemption yield on 10-year TBonds


Having imported the data in EXCEL format into E-Views, the following
transformations are performed using the programme file LSZ.PRG. The programme
is listed as follows:

genr lp=100*log(cpisa)
genr ly=100*log(rgdpmon)
genr infl=(lp-lp(-12))
genr rr=tbill3-infl(-1)
genr lm1=100*log(m1sa)
genr lm2=100*log(m2sa)
genr d12lm2=(lm2-lm2(-12))
genr d12lm1=(lm1-lm1(-12))
genr lyst=773.27+0.275*@TREND(1959:1)
genr vel=lp+ly-lm2


We initially deal with non-observable variables. We solve the problem of
expected inflation by setting (arbitrarily) $\beta = 1$ in equation (??) and
substituting lagged inflation for expected inflation. We then obtain an
observable proxy for potential output by fitting a simple deterministic trend
for output.

Table 1: LYST=C(1)+C(2)*@TREND(1959:1)

Coefficient   Estimate    Std. Error   t-Statistic   Prob.
C(1)          773.2748    0.466437     1657.833      0.0000
C(2)          0.274909    0.002476     111.0279      0.0000

R-squared            0.975006     Mean dependent var      818.4973
Adjusted R-squared   0.974927     S.D. dependent var      25.59787
S.E. of regression   4.053267     Akaike info criterion   2.805316
Sum squared resid    5191.555     Schwarz criterion       2.828976
Log likelihood       -895.2677    F-statistic             12327.19
Durbin-Watson stat   0.018264     Prob(F-statistic)       0.000000


Note that from the estimated parameter values we have that potential output
grows at an annual rate of $(1+0.0027)^{12} - 1 = 0.0329$, i.e. about 3.3 per
cent.
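
The arithmetic can be reproduced in a couple of lines. The Python fragment below
is only a worked check of the compounding formula, using the monthly rate of
0.0027 quoted in the text (the trend slope of about 0.27 in units of 100 times
the log of output); the variable names are illustrative.

```python
monthly_growth = 0.0027                        # monthly rate, as rounded in the text
annual_growth = (1 + monthly_growth) ** 12 - 1 # compounded over twelve months
annual_growth_pct = 100 * annual_growth        # expressed in per cent

print(round(annual_growth, 4), round(annual_growth_pct, 2))
```

Using the unrounded slope of 0.274909 instead of 0.27 gives a slightly higher
figure, still close to 3.3 per cent.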

We proceed now to the estimation of all the structural relations included in 
the model. We begin by money demand, which we simplify to a linear relation 
between the log of velocity of circulation of money and the short-term interest 
rate, which we take as a proxy of the opportunity cost of holding money. 


Table 2: VEL=C(1)+C(2)*TBILL3

Coefficient   Estimate    Std. Error   t-Statistic   Prob.
C(1)          527.9095    0.431272     1224.075      0.0000
C(2)          1.781791    0.061783     28.83933      0.0000

R-squared            0.724668     Mean dependent var      539.0975
Adjusted R-squared   0.723797     S.D. dependent var      6.392876
S.E. of regression   3.359777     Akaike info criterion   2.430019
Sum squared resid    3567.040     Schwarz criterion       2.453679
Log likelihood       -835.5954    F-statistic             831.7067
Durbin-Watson stat   0.117228     Prob(F-statistic)       0.000000


Note that the semi-elasticity of the velocity of circulation with respect to
the interest rate is 1.78, implying that an increase of one hundred basis points
in short-term rates is paired with a 178 basis points (1.78 per cent) increase
in the velocity of circulation. Note that this is not the elasticity: the
elasticity is $\eta_{R,VEL} = \partial \log(VEL) / \partial \log(R)$, while the
semi-elasticity is $s\eta_{R,VEL} = \partial \log(VEL) / \partial R$; therefore
$\eta_{R,VEL} = s\eta_{R,VEL} \cdot R$, as $\partial \log(R) = \partial R / R$.
Specifying money demand with the log of real money as a function of the level of
the nominal interest rate therefore has the important implication of making the
elasticity of money demand with respect to the opportunity cost of holding money
a function of the level of interest rates. This is important in that it is not
desirable to impose that the elasticity of money demand with respect to the
interest rate is constant.
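
A short numerical illustration of the difference between the two concepts, in
Python. The coefficient is the estimate from Table 2, divided by 100 because VEL
is measured as 100 times a logarithm; the interest-rate levels at which the
elasticity is evaluated are arbitrary illustrative values.

```python
# d log(VEL) / d R, with R measured in percentage points
semi_elasticity = 1.781791 / 100

def elasticity(rate_pct):
    """Elasticity of velocity with respect to the interest rate at a given rate (per cent)."""
    return semi_elasticity * rate_pct

# The semi-elasticity is constant, the elasticity grows with the level of the rate
elasticities = {r: elasticity(r) for r in (2.0, 5.0, 10.0)}
print(elasticities)
```

At a 5 per cent interest rate the implied elasticity is roughly 0.09, and it
doubles when the rate doubles.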

The third relation we estimate is an aggregate demand curve:

Table 3: LY=C(1)+LYST+C(2)*(TBILL3-INFL(-1))

Coefficient   Estimate     Std. Error   t-Statistic   Prob.
C(1)          1.470880     0.274338     5.361572      0.0000
C(2)          -0.383906    0.102396     -3.749208     0.0002

R-squared            0.970221     Mean dependent var      820.4813
Adjusted R-squared   0.970123     S.D. dependent var      24.21989
S.E. of regression   4.186429     Akaike info criterion   2.870232
Sum squared resid    5310.435     Schwarz criterion       2.894627
Log likelihood       -868.4866    F-statistic             9871.903
Durbin-Watson stat   0.021376     Prob(F-statistic)       0.000000


Note that LYST is included in the fitted relation with a coefficient
constrained to unity. As a consequence we can easily compute the level of the
long-run equilibrium real interest rate (as the real interest rate obtained by
setting y = y*); such level is 3.83 (=1.471/0.384).

The fourth estimated relation is the aggregate supply function: 


Table 4: INFL=C(1)*INFL(-1)+C(2)*(LY-LYST)

Coefficient   Estimate   Std. Error   t-Statistic   Prob.
C(1)          0.9996     0.0032       319.10        0.0000
C(2)          0.027      0.0047       5.89          0.0000

R-squared            0.99         Mean dependent var      5.100013
Adjusted R-squared   0.99         S.D. dependent var      3.296546
S.E. of regression   0.32         Akaike info criterion   0.60
Sum squared resid    32.14056     Schwarz criterion       0.63
Log likelihood       -89.62183    Durbin-Watson stat      1.54


Note that the estimated values for the parameters are extremely close to the 
case of maximum price stickiness and the adjustment of inflation with respect to 
the gap between output and potential output is significant but extremely slow. 


4.5.3 Simulating the effect of monetary policy 

Having estimated the model, we are now in a position to simulate it by
considering the estimated equations as a system of difference equations, which
can be solved after the specification of a money supply function. This
procedure allows the construction of a baseline, which can be used to evaluate
the effect of monetary policy by specifying an alternative rule for monetary
policy and by computing multipliers. The E-Views programme SOLVED1.PRG allows
the computation of the dynamic multipliers generated by a one per cent increase
in money supply. The programme contains the following statements:

' This is a program to compute dynamic multipliers
if @isobject("m_temp")=1 then
delete m_temp
endif
if @isobject("dfbase")=1 then
delete dfbase
endif
if @isobject("dfshock")=1 then
delete dfshock
endif

' baseline simulation
smpl 1986:01 2001:12
' define growth rate of money
genr x=6
model dfbase
' define exogenous variables
dfbase.append lm2=lm2(-12) +x
dfbase.append lyst=773.27+0.275*@TREND(1959:01)
' loading endogenous variables
dfbase.merge df
copy dfbase m_temp
m_temp.append assign @all _bl
m_temp.solve
delete m_temp
group exog_bl d12lm2_bl d12lyst_bl
group endog_bl tbill3_bl infl_bl d12ly_bl

' disturbed simulation
smpl 1986:01 2001:12
' define shock to the growth rate of money
genr y=1
model dfshock
' exogenous variables
dfshock.append lm2=lm2(-12) +(x+y)
dfshock.append lyst=773.27+0.275*@TREND(1959:01)
' loading endogenous variables
dfshock.merge df
copy dfshock m_temp
m_temp.append assign @all _ds
m_temp.solve
delete m_temp
group exog_ds d12lm2_ds d12lyst_ds
group endog_ds tbill3_ds infl_ds d12ly_ds
plot tbill3_bl tbill3_ds
plot infl_bl infl_ds
plot d12ly_bl d12ly_ds

' computing dynamic multipliers
' (difference divided by the size of the shock, y, as in formula (4.37))
genr dm_tbill3=(tbill3_ds-tbill3_bl)/y
genr dm_infl=(infl_ds-infl_bl)/y
genr dm_d12ly=(d12ly_ds-d12ly_bl)/y
group dm dm_tbill3 dm_infl dm_d12ly
plot dm

The first block of the programme defines objects that will contain the baseline 
model (dfbase) and the disturbed model (dfshock); it also defines a temporary 
object (m_temp) which will contain the model to be simulated in each round: 
if @isobject("m_temp")=1 then
delete m_temp
endif
if @isobject("dfbase")=1 then
delete dfbase
endif
if @isobject("dfshock")=1 then
delete dfshock
endif

A baseline simulation is then created. The simulation sample is first chosen;
as the model has been estimated over the sample 1959:07-1985:12, we proceed to
simulate it from 1986:1 onwards. In fact the sample for the simulation is purely
artificial, as all series are model generated when computing dynamic
multipliers. Choosing a specific sample makes sense only when historical values
are considered for some of the variables. Having chosen the sample, we set the
rate of growth of money x, the exogenous policy-controlled variable, at six per
cent. Then all the equations estimated in the previous section are included in
the model. The exogenous variables are included using an append statement, while
the endogenous variables are included by importing directly into model dfbase
the model df, containing all the estimated equations. Then the model is solved
dynamically by using Gauss-Seidel and the extension _bl is appended to all
generated variables. The variables are then grouped according to their status
into exogenous and endogenous.

' baseline simulation
smpl 1986:01 2001:12
' define growth rate of money
genr x=6
model dfbase
' define exogenous variables
dfbase.append lm2=lm2(-12) +x
dfbase.append lyst=773.27+0.275*@TREND(1959:01)
' loading endogenous variables
dfbase.merge df
copy dfbase m_temp
m_temp.append assign @all _bl
m_temp.solve
delete m_temp
group exog_bl d12lm2_bl d12lyst_bl
group endog_bl tbill3_bl infl_bl d12ly_bl

A disturbed simulation is then created following the same steps:

' disturbed simulation
smpl 1986:01 2001:12
' define shock to the growth rate of money
genr y=1
model dfshock
' exogenous variables
dfshock.append lm2=lm2(-12) +(x+y)
dfshock.append lyst=773.27+0.275*@TREND(1959:01)
' loading endogenous variables
dfshock.merge df
copy dfshock m_temp
m_temp.append assign @all _ds
m_temp.solve
delete m_temp
group exog_ds d12lm2_ds d12lyst_ds
group endog_ds tbill3_ds infl_ds d12ly_ds
plot tbill3_bl tbill3_ds
plot infl_bl infl_ds
plot d12ly_bl d12ly_ds

At the end of the block simulated values for the endogenous variables are
plotted. Finally, dynamic multipliers are computed, grouped into dm and plotted:

' computing dynamic multipliers
' (difference divided by the size of the shock, y, as in formula (4.37))
genr dm_tbill3=(tbill3_ds-tbill3_bl)/y
genr dm_infl=(infl_ds-infl_bl)/y
genr dm_d12ly=(d12ly_ds-d12ly_bl)/y
group dm dm_tbill3 dm_infl dm_d12ly
plot dm

We report the computed dynamic multipliers in Figure 4.2:

Fig. 4.2. Dynamic Multipliers (series plotted: DM_INFL, DM_D12LY, DM_TBILL3)

The one per cent increase in money supply has a one-to-one impact on inflation
in the long run and a zero impact on GDP growth. Money is neutral in the long
run but it does have a short-run impact on the output cycle, as prices are
sticky. As a consequence of price stickiness we also observe a short-run
liquidity effect on interest rates, while in the long run the Fisher
relationship applies and monetary policy does not have any impact on real
interest rates. Note that there is some cyclicality in the interest rate
multiplier; this is due to the cycle in nominal interest rates generated by the
model. In fact the cycle of output is not matched by any cycle in money supply,
which has just a trend. As a consequence nominal interest rates reflect to some
extent fluctuations in output. Setting money supply as completely exogenous
generates artificial series incapable of replicating some features of the
observed data. We shall re-address this point later.

At this stage we feel it is more important to concentrate on the reliability of
the description of the response of the economy to monetary policy derived from
the dynamic multipliers.

4.6 Assessing econometric evaluation of monetary policy. 

To have a first assessment of the reliability of our simulations we adopt the
following approach: we assume that the monetary authority has followed a rule
which delivered the observed data on the money stock and, using such variable as
exogenous, we endogenously generate the relevant macroeconomic time series for a
sample covering both the estimation (up to 1985:12) and the simulation (from
1986:1 to 1996:3) period. Such result is obtained by solving the following
version of the baseline model:

assign @all _fal
lyst=773.27+0.275*@TREND(1959:01)
'lm2=lm2(-12)+10
tbill3=(-1/1.758218)*(lm2-lp-ly) -(527.8596/1.758218)
ly=lyst+1.47-0.383*(tbill3-infl(-1))
(lp-lp(-12))=0.1 +0.975*(lp(-1)-lp(-13))+0.029*(ly-lyst)
INFL=(lp-lp(-12))
d12lm2=lm2-lm2(-12)
d12lyst=lyst-lyst(-12)
d12ly=ly-ly(-12)
cycle=ly-lyst

Note that now the equation for lm2 is commented out: the model will now be
solved by taking lm2 as exogenous and using the historical values for this
variable.

We report in Figures 4.3 and 4.4 the simulated (defined with suffix _fal) and
observed relevant macroeconomic variables: cycle (defined as the deviation of
output from trend output) and inflation.



Fig. 4.3. Observed and simulated cycle

Fig. 4.4. Observed and simulated inflation

Shaded areas distinguish the simulation sample from the estimation sample. The
analysis of Figures 4.3-4.4 reveals two problems. Over the estimation period the
simulated series do not have sufficiently rich dynamics to match the observed
time series. However, there is no tendency for a systematic deviation of
simulated series from observed series: the difference between model-generated
and observed time series has a long memory, but there is a pattern of reversion
toward the zero mean. When we turn to the simulation period the first problem
persists and, in addition, we start observing a systematic pattern in the
divergence between simulated and observed time series. Such evidence probably
justifies some skepticism towards econometric evaluation of monetary policy. To
further elaborate on this point we consider some diagnosis of the causes behind
the problem, in order to discuss the solutions proposed by recent developments
in econometric modelling of the monetary transmission mechanism.

4.7 What is wrong with econometric policy evaluation? 

The two problems with our application of the Cowles Commission approach seem
to be serious enough to warrant some discussion. We shall organize our
discussion by dividing explanations into two classes: those concentrating on
modifications in the estimation technique and those suggesting some
modifications in the modelling strategy.

The small structural model we have considered is estimated using the
single-equation OLS method. This method is clearly not appropriate, even taking
the "a priori" exogeneity assumptions on money supply and trend output as valid.
Consider the money demand equation, along with the aggregate demand and supply
schedules. These relations establish a simultaneous feedback between output,
prices and the interest rate which makes OLS inappropriate as the estimation
method. In the velocity equation, for example, the nominal interest rate, the
only stochastic regressor, should be correlated with the residual and therefore
the OLS estimate of the semi-elasticity of money demand with respect to the
interest rate should be biased. Biased estimates of the parameters of interest
could obviously explain the disappointing performance of the model under
simulation.
However, this potential explanation does not seem to be the relevant one. We
report in the following table the results of the estimation of the velocity
equation by GIVE using valid instruments.

Table 5: VEL=C(1)+C(2)*TBILL3 (GIVE estimation)

Coefficient   Estimate   Std. Error   t-Statistic   Prob.
C(1)          527.41     0.439        1212          0.0000
C(2)          1.85       0.06         29.46         0.0000

Instruments: C TBILL3(-1) LY(-1) LP(-1) LP(-2) LM2(-1)

R-squared            0.72         Mean dependent var      539.0975
Adjusted R-squared   0.72         S.D. dependent var      6.392876
S.E. of regression   3.34         Akaike info criterion   2.42
Sum squared resid    3567.040     Schwarz criterion       2.44
Log likelihood       -835.5954    F-statistic             868
Durbin-Watson stat   0.125        Prob(F-statistic)       0.000000
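
To make the comparison between OLS and GIVE concrete, the following Python
sketch contrasts the two estimators on simulated data in which the regressor is
correlated with the disturbance. It is not a replication of Table 5: the data
generating process, the single instrument and all numbers are hypothetical, and
the instrumental variables estimator is computed as two-stage least squares.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 20_000
z = rng.normal(size=n)                   # instrument: moves x, unrelated to the error
e = rng.normal(size=n)                   # structural disturbance
x = z + 0.8 * e + rng.normal(size=n)     # regressor correlated with the disturbance
y = 1.0 + 2.0 * x + e                    # true slope is 2

X = np.column_stack([np.ones(n), x])     # regressors (constant, x)
Z = np.column_stack([np.ones(n), z])     # instruments (constant, z)

# OLS: inconsistent here because x and e are correlated
beta_ols = np.linalg.solve(X.T @ X, X.T @ y)

# GIVE / 2SLS: project X on the instruments, then use the projection
X_hat = Z @ np.linalg.solve(Z.T @ Z, Z.T @ X)
beta_give = np.linalg.solve(X_hat.T @ X, X_hat.T @ y)

print(beta_ols[1], beta_give[1])
```

In this artificial example the OLS slope is visibly biased upwards while the
instrumental variables estimate is close to the true value. When the two sets of
estimates are close, as in Tables 2 and 5, simultaneity bias is unlikely to be
the main source of the problems observed under simulation.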


Note that the GIVE estimates are not very different from the OLS ones. This
result is robust to the consideration of full-information estimation methods².
The observed problems under simulation do not seem to be explained by the chosen
estimation method but rather by problems in identification and specification.
These are the two issues closely addressed by modern approaches to econometric
modelling. The first problem, i.e. the inability of the estimated model to
capture the observed dynamics of the variables of interest, could be explained
by the following considerations:

• The statistical model implicit in the estimated structure is "too
restrictive". There are two interpretations of the excessive simplicity in the
specification: omission of relevant variables, and omission of the relevant
dynamics for the included variables (note, for example, that the estimated
money demand relation is a simple, static equation).

• The identifying restrictions, although necessary to make the estimation
meaningful, deliver a structure which cannot adequately describe reality.
Think of money supply in the estimated model: if the monetary authority uses
money supply as an instrument to achieve given targets for the macroeconomic
variables, then it would be very "natural" for money supply to react not only
to output and inflation but also to leading indicators for these variables.
Assuming money supply as exogenous, the estimated model omits completely a
very relevant feedback and loses an important feature of the data. Moreover,
by incorrectly assuming exogeneity, the model might induce a spurious
statistical efficacy of monetary policy in the determination of macroeconomic
variables. The endogeneity of money does generate a correlation between
macroeconomic variables and monetary variables which, by invalidly assuming
money as exogenous, could be interpreted as a causal relation running from
money to the macroeconomic variables (Sims critique).

² A useful exercise

The worsening of the model’s performance under simulation could instead be 
explained by the following considerations: 

• Incorrect specification. Omitted variables have an effect which, not detected
when the model is estimated (possibly because the omitted variables were
"silent"), becomes relevant in explaining parameter instability of the
estimated equations in the simulation period. Incorrectly specified dynamic
models feature parameter instability in out-of-sample simulations.

• Model simulation implies considering alternative monetary policy regimes.
A change in regime might imply a structural shift in the parameters of the
estimated equations; therefore, the model estimated under the "baseline"
regime cannot be used to evaluate the effect of the "control" policy. In
other words the "Lucas critique" applies.

In the next chapters we shall consider in turn all these explanations by
discussing all the alternative modern approaches to applied macroeconometrics.

5.1 Introduction

The LSE approach explains the failure of the Cowles Commission methodology
by attributing it to the lack of attention for the statistical model underlying
the particular econometric structure adopted to analyse the effect of
alternative monetary policies. The LSE methodology considers econometric policy
evaluation an interesting and feasible exercise. However, the way in which the
Cowles Commission approach deals with a legitimate question is not seen as
correct. The lack of sufficient interest in the statistical model is at the root
of the failure of the Cowles Commission approach to provide an acceptable answer
to an interesting question. As can be seen from the application discussed in the
previous chapter, econometric analyses within the Cowles Commission tradition
begin from the idea that the structural form of the process generating the data
is known qualitatively; reduced forms are then derived from such structures.
Within such a framework the validity of the reduced form is not tested. The LSE
approach views this lack of validation of the reduced form as undermining the
credibility of the structural parameter estimates. The LSE approach recognizes
that economic theory suggests the general specification of the relevant form,
but the precise representation of the Data Generating Process is almost never
known in advance. Thus modelling procedures are required to determine the
credibility of the estimated models. The reduced form takes a central role
within this approach in that it represents the crucial probabilistic structure
of the data¹.

The traditional logic of the Cowles Commission, according to which the reduced
form is derived given the structural model, is turned upside down within the LSE
approach. The reduced form is defined first, by defining a system via the set of
variables considered, their classification into modelled and non-modelled
variables (endogenous and exogenous in the traditional terminology) and the
specification of the lag polynomials. The system is then validated by applying
the three basic principles of econometrics: "test, test and test". The null
hypothesis of interest here is the absence of symptoms of mis-specification,
such as residual non-normality, autocorrelation, heteroscedasticity and
parameter non-constancy. If the null is not rejected and the system can be
considered as a congruent representation of the unknown Data Generating Process,
then non-stationarity can be dealt with and the long-run properties of the
system can be identified by implementing cointegration analysis. Note again that
cointegration analysis is fully implemented on the reduced form, and the
identification of the structural long-run relationships is a totally separate
problem from the identification of the structural short-run simultaneous
relationships. In the last step a structural model is identified and estimated.
No further validation is possible for just-identified models, as they impose no
restrictions on the system, while the validity of over-identified models is
testable by testing the validity of the over-identifying restrictions implicitly
imposed on the reduced form. Finally, policy simulation can be performed after
testing that the necessary requirement for the model to be robust to the Lucas
critique, i.e. superexogeneity of the relevant variables for the estimation of
the parameters of interest, is satisfied.

¹ See Spanos [21], Hendry, Neale and Srba [16].


5.2 The LSE diagnosis

The LSE diagnosis of the problems displayed by Cowles Commission models is 
simple: 

“...the statistical properties attributed to the structural estimators and
related tests are in general invalid unless the probabilistic structure imposed
on the data via the reduced form is valid. A glance at the empirical literature
confirms that not only are the statistical assumptions underlying the reduced
form not tested, but the reduced form is rarely estimated explicitly. Indeed,
the most popular estimation methods for the structural parameters are
limited-information instrumental-variable methods such as two-stage least
squares which do not even specify the implied reduced form...” ([21], p. 90).

Consider the structural model used to illustrate the Cowles Commission approach
in the previous chapter:
only parameters different from zero or one are to be considered “free” and
are then estimated to describe the economic properties of the adopted structure.
From this representation we immediately note that the exogeneity assumptions
imply a remarkable number of restrictions on the set of parameters of interest.
Another substantial set of restrictions derives from the very limited dynamics
adopted in the specification of the model. The implied reduced form also
features a remarkable number of restrictions, as is easily checked by
pre-multiplying the model by $A^{-1}$:
According to the LSE criticism, the validity of such a reduced form is not
properly addressed within the Cowles Commission tradition. Structural inference
based on an improper statistical model is the LSE diagnosis for the failure of
the Cowles Commission approach. The adequacy of the statistical model is
assessed by evaluating the properties of the residuals.


5.3 The reduction process 

Econometric modelling is formalized within the LSE camp as the result of a
reduction process. The starting point of the reduction process is a long way up:
think of a vector $x_t$ containing observations on all economic variables at
time $t$. A sample of $T$ time-series observations on all the variables can be
represented as follows:


$$X_T^1 = \begin{pmatrix} x_1 \\ \vdots \\ x_T \end{pmatrix}$$


The starting point of the reduction process is a model for the Data Generating
Process (DGP). The DGP is described by the joint density function
$D(X_T^1, \theta)$, where $X_{t-1}$ is the matrix including observations on all
variables in $x$ from time 1 to time $t-1$, and $\theta$ is a set of parameters.
Model specification amounts to choosing a particular functional form for the
density. Having chosen the model, a structure for the model is pinned down by
identifying parameters and estimating them. In general, estimation is performed
by considering the joint sample density function, also known as the likelihood
function, which we can express as $D(X_T^1 \mid X_0, \theta)$. The likelihood
function is defined on the parameter space $\Theta$, given the observed sample
$X_T^1$ and a set of initial conditions $X_0$. Such initial conditions can be
interpreted as the pre-sample observations on the relevant variables (which are
usually not available). In the case of independent observations the likelihood
function can be written as the product of the density functions for each
observation. However, this is not the relevant case for time series, as
time-series observations are in general sequentially correlated. In the case of
time series the sample density is then constructed using the concept of
sequential conditioning. The likelihood function, conditioned with respect to
initial conditions, can always be written as the product of a marginal density
and a conditional density as follows:

$$D(X_T^1 \mid X_0, \theta) = D(x_1 \mid X_0, \theta)\, D(X_T^2 \mid X_1, \theta)$$

Obviously we also have

$$D(X_T^2 \mid X_1, \theta) = D(x_2 \mid X_1, \theta)\, D(X_T^3 \mid X_2, \theta)$$

and, by recursive substitution, we eventually obtain:

$$D(X_T^1 \mid X_0, \theta) = \prod_{t=1}^{T} D(x_t \mid X_{t-1}, \theta)$$
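As a small numerical check of the sequential factorisation, the sketch below
compares the joint log-density of $(x_1, x_2)$ given $x_0$ with the sum of the
two conditional log-densities. It assumes, purely for illustration, a Gaussian
AR(1) process $x_t = \phi x_{t-1} + e_t$ with innovation variance `sigma2`; the
function names and the numerical values are hypothetical and are not taken from
the text.

```python
import math


def norm_logpdf(x, mean, var):
    """Log-density of a univariate normal distribution."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)


def bivariate_norm_logpdf(x1, x2, m1, m2, v1, v2, cov):
    """Log-density of a bivariate normal with the given means, variances and covariance."""
    det = v1 * v2 - cov ** 2
    d1, d2 = x1 - m1, x2 - m2
    quad = (v2 * d1 ** 2 - 2 * cov * d1 * d2 + v1 * d2 ** 2) / det
    return -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad


def joint_logdensity(x1, x2, x0, phi, sigma2):
    """log D(x1, x2 | x0): moments of (x1, x2) implied by the assumed AR(1)."""
    return bivariate_norm_logpdf(
        x1, x2,
        phi * x0, phi ** 2 * x0,            # conditional means given x0
        sigma2, sigma2 * (1 + phi ** 2),    # conditional variances given x0
        phi * sigma2,                       # conditional covariance given x0
    )


def sequential_logdensity(x1, x2, x0, phi, sigma2):
    """log D(x1 | x0) + log D(x2 | x1): the sequentially conditioned form."""
    return norm_logpdf(x1, phi * x0, sigma2) + norm_logpdf(x2, phi * x1, sigma2)


joint = joint_logdensity(0.3, -0.8, 1.2, phi=0.7, sigma2=1.5)
sequential = sequential_logdensity(0.3, -0.8, 1.2, phi=0.7, sigma2=1.5)
print(abs(joint - sequential) < 1e-12)
```

The two numbers agree up to rounding error, which is the two-observation case
of the product formula above.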

Having obtained $D(X_T^1 \mid X_0, \theta)$ we can in theory derive
$D(X_T^1, \theta)$ by integrating the density conditional on pre-sample
observations with respect to $X_0$. In practice this may not be analytically
tractable, as $D(X_0)$ is not known. The hypothesis of stationarity becomes
crucial at this stage, as stationarity restricts the memory of time series and
limits the effects of pre-sample observations to the first observations in the
sample. This is the reason why, in the case of stationary processes, initial
conditions can simply be ignored. Clearly, the larger the sample the better, as
the weight of the information lost becomes smaller. Moreover, note that, even
when omitting initial conditions, we have:

$$D(X_T^1 \mid X_0, \theta) = D(x_1 \mid X_0, \theta) \prod_{t=2}^{T} D(x_t \mid X_{t-1}, \theta)$$

therefore the likelihood function is separated into the product of $T-1$
conditional distributions and one unconditional distribution. In the case of
non-stationarity the unconditional distribution is not defined. On the other
hand, in the case of stationarity the DGP is completely described by the
conditional density function $D(x_t \mid X_{t-1}, \theta)$.

The baseline of the reduction process, the DGP or the Haavelmo distribution,
is then completely described by $D(x_t \mid X_{t-1}, \theta)$. The first step of
the reduction process can be understood by partitioning $x_t$ into three types
of variables:


$$x_t = (w_t, y_t, z_t)$$


where $w_t$ identifies variables which are not observable or are not relevant
to the problem investigated by the econometrician. In practice these variables
are ignored; in theory this result is obtained by factorising the joint density
and integrating it with respect to $w_t$:


$$D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = \int\!\!\int D(y_t, z_t, w_t \mid Y_{t-1}, Z_{t-1}, W_{t-1}, \theta)\, D(W_{t-1} \mid Y_{t-1}, Z_{t-1}, \theta)\, dW_{t-1}\, dw_t$$

In this case we have a potential information loss which becomes real when 
the variables judged irrelevant for the problem at hand are not so. In formal 
terms, we do not have any information loss only if: 


$$D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = D(y_t, z_t, w_t \mid Y_{t-1}, Z_{t-1}, W_{t-1}, \theta)$$

This is the statistical description of the model considered by the
econometrician; in other words, it is the reduced form of the structure of
interest to the economist. In general, at the empirical level, this is the
earliest stage of the reduction process: a reduced form for all the variables of
interest (a VAR) is the most general model fitted to the data. However, such a
general model, viable for empirical estimation, certainly does not coincide with
the Haavelmo distribution for all economic variables!

How can we be sure that no loss of relevant information occurred in moving
from the Haavelmo distribution to the estimated empirical model? By applying
the three fundamental rules of LSE econometrics, “test, test and test”, to
our reduced form. In fact $D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta)$ is
empirically constructed by parameterising
$E(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta)$ as follows:
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From the specification of conditional means the vector of innovations u t is derived 
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Going back to our application to the monetary transmission mechanism, the 
baseline of the investigation is a reduced form of the type: 
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(5.2) 


Any empirical model is in itself the product of some step in the reduction 
process. So the starting point of the empirical analysis is the implementation of 
a battery of diagnostic tests, where the null hypothesis of interest is the validity 
of the baseline model as a simplified representation of the unknown DGP. 


5.4 Test, test and test 

Given that the DGP is unknown, the validity of the reduction can be checked by
ensuring that the vector of innovations $u_t$ possesses all the features of true
statistical innovations: absence of autocorrelation, heteroscedasticity and
non-normality. Any pattern of this type, or any instability in the $\beta$
parameters, can then be interpreted as a signal of a loss of information that
occurred in the hidden reduction from the DGP to the particular estimated form
adopted. The three fundamental principles of the LSE methodology are “test,
test, and test” because only by implementing diagnostic checks can we discard
invalid structural models. Testing usually concentrates on residuals because any
non-randomness in residual behaviour can be interpreted as a signal of incorrect
specification of the underlying model. The residuals of a statistical model are
generated by the specification adopted by the econometrician and are a
by-product of omitted variables (both in the sense of omitted important
variables and of omitted lags of included variables) and of errors in included
variables of several types (measurement errors, expectational errors). We
illustrate how the relevant tests can be constructed with reference to the
statistical model (5.2).

5.4.1 Testing autocorrelated residuals 

Residual autocorrelation is usually tested via a Lagrange Multiplier test [4],
which uses the following formulation:
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(5.3) 


So residual autocorrelation of the n-th order is checked by testing whether the
components of lagged fitted residuals not explained by the regressors in the
original model are significant in explaining contemporaneous fitted residuals. A
test against the null of absence of serial correlation of order n is implemented
by considering lags of the fitted residuals up to the n-th. The null hypothesis
of interest is:


$$H_0 : \delta_i = 0$$

The test, based on the $R^2$ of the auxiliary system, is asymptotically
distributed as a $\chi^2$ with $nm^2$ degrees of freedom, where $m$ is the
number of variables entering the reduced form; see Godfrey [11]. An
F-approximation with small-sample corrections is also available; see Kiviet
[18]. The intuition behind this procedure of model evaluation by variable
addition is that, under the null hypothesis, the component of lagged residuals
not explained by the regressors in the model is not significant in explaining
current residuals; see Pagan [20].
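The variable-addition logic can be sketched for the single-equation case
($m = 1$), where the statistic reduces to $T R^2$ from a regression of the
fitted residuals on the original regressors and $n$ lagged residuals. The code
below is a minimal illustration under stated assumptions, not the system test
implemented in PC-FIML: the simulated data, the function names and the choice
of setting pre-sample lagged residuals to zero are all ours.

```python
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def ols_residuals(X, y):
    """OLS residuals of y on the columns of X (X is a list of rows)."""
    k = len(X[0])
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]
    beta = solve(xtx, xty)
    return [yi - sum(b * v for b, v in zip(beta, row)) for row, yi in zip(X, y)]


def lm_autocorrelation(X, y, n_lags):
    """T * R^2 of the auxiliary regression; asymptotically chi-square(n_lags) under the null."""
    u = ols_residuals(X, y)
    T = len(u)
    # Pre-sample lagged residuals are set to zero so that all T observations are used.
    aux = [row + [u[t - j] if t - j >= 0 else 0.0 for j in range(1, n_lags + 1)]
           for t, row in enumerate(X)]
    e = ols_residuals(aux, u)
    tss = sum(v * v for v in u)  # u has zero mean because X contains a constant
    return T * (1.0 - sum(v * v for v in e) / tss)


def simulate(rho, T=200, seed=0):
    """Hypothetical data: y = 1 + 2 x + error, with AR(1) errors of coefficient rho."""
    rng = random.Random(seed)
    X, y, err = [], [], 0.0
    for _ in range(T):
        x = rng.gauss(0.0, 1.0)
        err = rho * err + rng.gauss(0.0, 1.0)
        X.append([1.0, x])
        y.append(1.0 + 2.0 * x + err)
    return X, y


X, y = simulate(rho=0.9)
stat = lm_autocorrelation(X, y, n_lags=2)
print(stat > 5.99)  # 5.99 is the 5% critical value of a chi-square with 2 degrees of freedom
```

With strongly autocorrelated errors the statistic lies far above the critical
value, so the null of no serial correlation is rejected.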

5.4.2 Testing heteroscedastic residuals 

To illustrate tests for the null of homoscedasticity, consider the simple case
where we have a system including two variables, one monetary variable and one
macroeconomic variable. After estimation of (5.2), a test can be performed by
running the following auxiliary model:

( \ /e lt \ 

— So + D* (L) I Mt_! I + I e2t I (5-4) 

) \e 3t J 

Under the null hypothesis the variance-covariance matrix of the system
residuals is constant. Hence, we have

$$H_0 : D^*(L) = 0.$$

The test is easily generalized to systems of $m$ variables, with the proviso
that, as $m$ gets large, the limited degrees of freedom might make it
infeasible. The procedure is best interpreted as an extension of the
heteroscedasticity tests proposed by White [23] in the context of
single-equation models. Of course, whenever the degrees of freedom problem is
binding, a White test can be run on all the equations separately. Not rejecting
the null in this case would satisfy a necessary condition for homoscedasticity
of the system residuals. The condition is not sufficient because it does not
provide a test for constancy of the covariances. At the single-equation level,
ARCH-type tests could also be run, see Engle [6], by specifying the following
models:
where the null of interest is:

$$H_0 : \delta_i = 0$$

Note that all the tests for heteroscedasticity presented here take some estimate
of the variance-covariance matrix of the system residuals and check its
constancy over time. The difference between the tests lies in the specification
of the alternative, i.e. of the variables used to capture the fluctuations over
time of the moments under the alternative distribution.
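For a single equation, the simplest ARCH-type check regresses squared residuals
on a constant and their own first lag and uses the number of observations times
$R^2$, which is asymptotically $\chi^2(1)$ under the null. The sketch below
assumes this first-order case only; the function name and the example series
are hypothetical.

```python
import math


def arch_lm(residuals):
    """Observations * R^2 from regressing squared residuals on a constant and their first lag."""
    sq = [u * u for u in residuals]
    y, x = sq[1:], sq[:-1]
    n = len(y)
    my, mx = sum(y) / n, sum(x) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    if sxx == 0.0 or syy == 0.0:
        return 0.0  # constant squared residuals: no time variation to explain
    return n * sxy ** 2 / (sxx * syy)


# Squared residuals growing linearly over time: the lagged square explains the
# current square perfectly, so R^2 is one and the statistic equals the number
# of usable observations (nine here).
trending = [math.sqrt(k) for k in range(1, 11)]
print(round(arch_lm(trending), 6))

# Residuals of constant magnitude carry no evidence of changing variance.
print(arch_lm([1.0, -1.0, 1.0, -1.0, 1.0]))
```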

5.4.3 Testing residuals normality 

Normality of residuals is a crucial property in that the entire statistical
framework used to “test, test and test” is based on this assumption. A vector
normality test has been proposed by Doornik and Hansen [5].

The test is constructed by first standardising the residuals. Define
$R' = (r_1, \dots, r_T)$ as the matrix of standardised residuals, so that
$C = T^{-1}R'R$ is the correlation matrix. The standardised residuals, normally
distributed under the null with zero mean and variance-covariance matrix $C$,
can be transformed into independent standard normals:

$$e_t = E\Lambda^{-1/2}E'r_t$$

where $\Lambda$ is a diagonal matrix with the eigenvalues of $C$ on the
principal diagonal and the columns of $E$ are the corresponding eigenvectors,
such that $E'E = I$ and $\Lambda = E'CE$.

The test is performed by computing the univariate skewness and kurtosis of each
transformed residual and comparing them with those of the normal distribution.
Define $b_1' = (b_{11}, \dots, b_{1m})$ and $b_2' = (b_{21}, \dots, b_{2m})$ as
the vectors containing the sample estimates of the skewness and kurtosis of the
transformed residuals of the $m$ equations included in the model. The test
statistic is:

$$\frac{T\,b_1'b_1}{6} + \frac{T\,(b_2 - 3\iota)'(b_2 - 3\iota)}{24} \overset{asy}{\sim} \chi^2(2m)$$

where $\iota$ is a vector of ones. As the above requires large samples,
corrected versions are proposed and implemented in the PC-FIML package.
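For a single series ($m = 1$, so no decorrelation step is needed) the
asymptotic statistic above reduces to the familiar skewness-kurtosis form,
distributed as $\chi^2(2)$ under normality. The sketch below computes that
uncorrected version only; it does not include the small-sample corrections of
Doornik and Hansen, and the function name is ours.

```python
def normality_statistic(residuals):
    """T*b1^2/6 + T*(b2-3)^2/24 for one series, with b1 skewness and b2 kurtosis."""
    T = len(residuals)
    mean = sum(residuals) / T
    dev = [r - mean for r in residuals]
    m2 = sum(d ** 2 for d in dev) / T
    m3 = sum(d ** 3 for d in dev) / T
    m4 = sum(d ** 4 for d in dev) / T
    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2
    return T * skewness ** 2 / 6 + T * (kurtosis - 3) ** 2 / 24


# A symmetric sample has zero skewness, so only the kurtosis term contributes.
print(normality_statistic([-2.0, -1.0, 0.0, 1.0, 2.0]))
```

In this example the second moment is 2 and the fourth moment is 6.8, giving a
kurtosis of 1.7 and a statistic of $5 \times 1.69 / 24 \approx 0.352$.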

5.4.4 Testing parameters stability 

Within the LSE methodology “variable parameters” is an oxymoron. In fact,
Hendry [15] makes clear that “...Models which have no set of constancies will
be useless for forecasting the future, analysing economic policy, or testing
economic theories, since they lack entities on which to base those
activities...”. Testing parameter constancy is therefore an important aspect of
the diagnostic checking procedure. This is usually done within the LSE tradition
by estimating models recursively and applying Chow [3] tests for parameter
stability.

Single-equation Chow tests include 1-step F-tests, break-point F-tests and
forecast F-tests.

1-step forecast tests are $F(1, t-k-1)$ under the null of constant parameters,
for $t = N, \dots, T$ and $k$ included regressors. A typical statistic is
calculated as:

$$\frac{(RSS_t - RSS_{t-1})(t-k-1)}{RSS_{t-1}}$$

where $RSS_t$ is the residual sum of squares computed from the estimation on
$t$ observations. These statistics are computed by PC-GIVE and PC-FIML for all
possible break points after initialization of the estimation.

Break-point F-tests are $F(T-t+1, t-k-1)$ for $t = N, \dots, T$. The null of
interest is the stability of parameters when the model is estimated on the
sample 1 to $t-1$, against an alternative which allows any form of change over
$t$ to $T$. A typical statistic is calculated as:

$$\frac{(RSS_T - RSS_{t-1})(t-k-1)}{RSS_{t-1}(T-t+1)}$$

Forecast F-tests are $F(t-N+1, N-k-1)$ for $t = N, \dots, T$; they test the
stability of the model estimated on the sample 1 to $N-1$ against an
alternative which allows any form of change over $N$ to $t$. A typical statistic
is calculated as:

$$\frac{(RSS_t - RSS_{N-1})(N-k-1)}{RSS_{N-1}(t-N+1)}$$

All these tests can be extended to systems by defining F-approximations to
likelihood ratio statistics [4].
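Given a sequence of recursively computed residual sums of squares, the three
single-equation statistics are simple ratios. The sketch below assumes the
standard F form, in which the change in the residual sum of squares is divided
by the number of observations in the forecast period; the function names and
the RSS values are hypothetical, chosen only to make the arithmetic easy to
follow.

```python
def one_step_chow(rss, t, k):
    """1-step test for observation t: F(1, t-k-1) under constant parameters."""
    return (rss[t] - rss[t - 1]) * (t - k - 1) / rss[t - 1]


def break_point_chow(rss, t, T, k):
    """Break-point test: sample 1..t-1 against change over t..T, F(T-t+1, t-k-1)."""
    return (rss[T] - rss[t - 1]) * (t - k - 1) / (rss[t - 1] * (T - t + 1))


def forecast_chow(rss, N, t, k):
    """Forecast test: sample 1..N-1 against change over N..t, F(t-N+1, N-k-1)."""
    return (rss[t] - rss[N - 1]) * (N - k - 1) / (rss[N - 1] * (t - N + 1))


# Hypothetical residual sums of squares, keyed by the number of observations used.
rss = {10: 5.0, 11: 6.0, 20: 10.0}
k = 2  # number of regressors

print(one_step_chow(rss, 11, k))          # (6 - 5) * 8 / 5
print(break_point_chow(rss, 11, 20, k))   # (10 - 5) * 8 / (5 * 10)
print(forecast_chow(rss, 11, 20, k))      # same sample split, so same value
```

When the forecast test is evaluated at the end of the sample and the
break-point test at the first forecast observation, the two statistics
coincide, as the last two lines illustrate.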

Chow tests are tests for instability generated by a single break point,
occurring at a known date within the sample. Refinements of the testing
procedure have been proposed to deal with breaks occurring at uncertain dates
and with multiple breaks. Andrews [1] proposes to deal with uncertainty by using
trimming points to define a subsample in which the break is likely to have
occurred, and then computing all possible Chow tests (in $\chi^2$ form) for
every break point in that subsample. The largest statistic so obtained provides
a stability test (“maximum Chow” test) for an unknown break point. The article
provides the underlying distributional theory and critical values, which are
functions of the degrees of freedom and the trimming points.
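The search over candidate break points can be sketched for the simplest
possible model, a constant mean that may shift once. The code below is an
illustration under our own assumptions: it uses the F form of the Chow
statistic rather than the $\chi^2$ form, a symmetric 15% trimming, and a
made-up series; the critical values tabulated by Andrews are not reproduced,
and the largest statistic must not be compared with standard F tables.

```python
def rss_mean(segment):
    """Residual sum of squares from fitting a constant to the segment."""
    m = sum(segment) / len(segment)
    return sum((v - m) ** 2 for v in segment)


def max_chow(y, trim=0.15):
    """Largest Chow F statistic over trimmed break points, and the break achieving it.

    A break at tau means the first regime contains the first tau observations.
    """
    T = len(y)
    k = 1  # one parameter (the mean) per regime
    rss_full = rss_mean(y)
    first = max(int(trim * T), k + 1)
    last = T - first
    best_stat, best_break = float("-inf"), None
    for tau in range(first, last + 1):
        rss_split = rss_mean(y[:tau]) + rss_mean(y[tau:])
        stat = ((rss_full - rss_split) / k) / (rss_split / (T - 2 * k))
        if stat > best_stat:
            best_stat, best_break = stat, tau
    return best_stat, best_break


# Hypothetical series: mean 0 for ten observations, then mean 1 for ten more.
series = [0.1, -0.1] * 5 + [1.1, 0.9] * 5
stat, tau = max_chow(series)
print(tau, round(stat, 3))
```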

5.5 Testing the Cowles Commission model 

As a baseline model we consider the following generalization of the statistical
model underlying the simple Cowles Commission specification:

We estimate the above specification by OLS, using PC-FIML, over the sample
1959:7-1985:12. The residuals for the four equations in the system are reported
in Figure (5.1), while diagnostic tests are reported in Table 1.





Fig. 5.1. 
TABLE 1: Diagnostic tests

          AR1-7 F(7,267)   Normality χ²(2)   ARCH7 F(7,260)   Xi² F(50,223)
LY        0.7691           4.9172            0.72095          0.60953
          [0.6137]         [0.0856]          [0.6543]         [0.9807]
INFL      2.671            9.9844            0.922            1.5344
          [0.0109]         [0.0068]          [0.4898]         [0.0196]
LM2P      1.9475           21.096            2.4126           1.1241
          [0.0625]         [0.0000]          [0.0208]         [0.2807]
TBILL3    2.1913           162.11            20.494           4.7914
          [0.0353]         [0.0000]          [0.0000]         [0.0000]
Vector    2.4529           214.5             -                1.3827
          [0.0000]         [0.0000]                           [0.0000]


The plot of the residuals and the results of the diagnostic tests reported in
Table 1 make one point clear: the adopted specification does not deliver an
acceptable statistical model. Given that this model is more general than the
simple specification used to illustrate the Cowles Commission approach, the
result holds a fortiori for that model.

5.6 Searching for a congruent specification 

In the previous section we have illustrated the diagnosis of the problems of the
Cowles Commission models proposed by the LSE approach. We now consider the
prognosis: begin the search for the final specification starting from an
appropriate statistical model for the data. Looking at the behaviour of the
residuals from the previously estimated model, we note that the equation for the
interest rate shows a substantial degree of instability over the period
1979-1982. In fact, in this period a different monetary regime was adopted by
the Fed, which abandoned a strategy aimed at controlling interest rates to
embrace a non-borrowed reserves targeting regime. As a consequence, the
volatility of short-term interest rates changes dramatically over the period
1979-1982. Such volatility goes back to pre-1979 levels only when non-borrowed
reserves targeting is abandoned at the end of 1982 (see Walsh [22]). Mixing two
different policy regimes is a recipe for parameter instability; therefore we
concentrate on a single regime and shorten the sample to end estimation in
1979:10. A second problem is detected by the diagnostics in the equation for
inflation. Several outliers here are generated by the oil price shocks of 1973
and 1979. To fix this problem we include among the variables in the system an
index of commodity prices. Such a variable could also be important in modelling
the behaviour of the monetary policy maker, if it plays a role as a leading
indicator for inflation. Lastly, to model money demand properly it seems
necessary to consider explicitly the own return on money in the construction of
the opportunity cost of holding this asset. A time series for this variable is
made available by the Fed at the internet site http//www.??.??. We extend our
baseline system to include this new variable. We re-estimate the system over the
shortened sample with the new endogenous and exogenous variables. As the
residuals show some persistent signs of non-normality, we include a set of
dummies to remove outliers (observations generating residuals whose magnitude
exceeds, in absolute value, three times the standard deviation of the fitted
residuals). We then choose the following as our baseline model:


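$$\begin{pmatrix} y_t \\ \pi_t \\ R^m_t \\ R^b_t \\ (m-p)_t \end{pmatrix} =
\begin{pmatrix} d_{0,11} & d_{0,12} \\ d_{0,21} & d_{0,22} \\ d_{0,31} & d_{0,32} \\ d_{0,41} & d_{0,42} \\ d_{0,51} & d_{0,52} \end{pmatrix}
\begin{pmatrix} 1 \\ TREND \end{pmatrix}
+ \sum_{i=1}^{6} D_i \begin{pmatrix} y_{t-i} \\ \pi_{t-i} \\ R^m_{t-i} \\ R^b_{t-i} \\ (m-p)_{t-i} \end{pmatrix}
+ \sum_{i=0}^{6} A_{i2}\, LPCM_{t-i} + g'DUM_t
+ \begin{pmatrix} u_{1t} \\ u_{2t} \\ u_{3t} \\ u_{4t} \\ u_{5t} \end{pmatrix} \qquad (5.6)$$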


$DUM_t$ is a vector of dummy variables containing: dum7306, dum7307, dum7308,
dum7310, dum7311, dum7312, dum7402, dum7403, dum7407, dum7408, dum7409,
dum7501, dum7505, dum7806, dum7808, dum7811, dum7904. In general, dumYYMM is a
variable taking value 1 in month MM of year YY and zero elsewhere.
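The outlier rule and the construction of the impulse dummies can be sketched as
follows. The sketch assumes a monthly sample running from 1959:7 to 1979:10
(the start of the original estimation sample and the shortened end date
mentioned above, ignoring observations lost to lags), and reads the dummy names
listed above as two digits for the year followed by two digits for the month;
the function names are hypothetical.

```python
import statistics


def impulse_dummy(name, start=(1959, 7), end=(1979, 10)):
    """Build the 0/1 series for a dummy named 'dumYYMM' over a monthly sample."""
    year, month = 1900 + int(name[3:5]), int(name[5:7])
    n_obs = (end[0] - start[0]) * 12 + (end[1] - start[1]) + 1
    position = (year - start[0]) * 12 + (month - start[1])
    if not 0 <= position < n_obs:
        raise ValueError(f"{name} falls outside the sample")
    return [1 if i == position else 0 for i in range(n_obs)]


def outlier_positions(residuals, threshold=3.0):
    """Positions whose residual exceeds, in absolute value, threshold standard deviations."""
    sd = statistics.pstdev(residuals)
    return [i for i, u in enumerate(residuals) if abs(u) > threshold * sd]


dummy = impulse_dummy("dum7306")
print(len(dummy), dummy.index(1))

# A made-up residual series with a single large value at the end.
print(outlier_positions([0.0] * 20 + [10.0]))
```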

We plot the residuals in Figure (5.2) and report the diagnostic tests in Table 2.


TABLE 2: Diagnostic tests

          AR1-7 F(7,267)   Normality χ²(2)   ARCH7 F(7,260)   Xi² F(50,223)
LY        0.92159          1.577             1.0196           0.82561
          [0.4913]         [0.4545]          [0.4496]         [0.7929]
INFL      1.4775           3.1857            0.38832          0.59005
          [0.4787]         [0.2033]          [0.9084]         [0.9874]
LM2P      2.0011           3.04              0.49178          0.69226
          [0.0580]         [0.2487]          [0.8395]         [0.9448]
M2OWN     5.2877           12.04             0.70325          1.0182
          [0.0000]         [0.0024]          [0.6693]         [0.4606]
TBILL3    1.0123           6.6683            2.8439           0.89271
          [0.4246]         [0.0356]          [0.0084]         [0.6835]
Vector    1.4779           24.745            -                0.66171
          [0.0004]         [0.0058]                           [1.0000]


The situation now looks much improved, although the equation for the own rate
on money still shows some problems of autocorrelation and non-normality,
signalled both by the single-equation and by the system diagnostics. We
attribute these problems to the very peculiar time-series behaviour of this
series and decide to proceed further in our analysis by considering (5.6) as a
congruent representation of the unknown Data Generating Process.
Fig. 5.2.


5.7 Cointegration Analysis 

The next step in the specification strategy is the identification of the
long-run equilibria in our model. The number of cointegrating vectors can be
detected by applying the Johansen procedure to identify the rank of the matrix
$\Pi$ in the following re-parameterisation of our model:
We have already seen in Chapter 2 that two identification schemes deliver
alternative representations of the long-run equilibria based on over-identifying
restrictions not rejected by the data. The first scheme is centred upon a money
demand relation, while the second is centred upon an interest rate reaction
function. We have also shown that the analysis of the adjustment parameters
makes the second scheme preferable. On the basis of these results, we opt for
the second identification scheme and proceed to specify a structural model for
the policy rates, output and inflation. Within such a scheme, real money is
completely determined by the demand side and loses any interest for the analysis
of monetary policy. When the researcher loses interest in real money, the return
on money also becomes uninteresting. Our economic interpretation of the results
of the cointegration analysis makes our baseline reduced form unnecessarily
complicated. The natural question at this point regards the legitimacy of a
simplification of the model in moving from the reduced form to the structural
model of interest.

5.8 Specifying the structural model 

Having validated the reduced form, the econometrician is left with the problem 
of identifying the appropriate structure. Moreover, we have seen that the reduced 
form might constitute in itself a model unnecessarily complicated for the problem 
at hand. It is then important to identify the cases in which further simplification, 
obtained by reducing the dimension of the estimated system, is viable with no 
loss of relevant information for the purposes of analysis. 

5.8.1 Exogeneity 

Suppose that the relevant problem is inference on a subset $\beta_1$ of the
parameters determining the joint density of $y_t$ and $z_t$. In general it is
always possible to re-write $D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta)$ as
follows:

$$D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = D(y_t \mid z_t, Y_{t-1}, Z_{t-1}, \beta_1, \beta_2)\, D(z_t \mid Y_{t-1}, Z_{t-1}, \beta_1, \beta_2) \qquad (5.7)$$

The general case admits as a special case the existence of a “sequential cut”,
which we represent as follows:

$$D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = D(y_t \mid z_t, Y_{t-1}, Z_{t-1}, \beta_1)\, D(z_t \mid Y_{t-1}, Z_{t-1}, \beta_2) \qquad (5.8)$$

If this is the case, and if the set on which the parameters $\beta_1$ are
defined is totally independent from the set on which the parameters $\beta_2$
are defined ($\beta_1$ and $\beta_2$ are variation free), then inference on
$\beta_1$ can be performed by concentrating only on the conditional density for
$y_t$, without explicitly treating the marginal density for $z_t$. To gain some
intuition for this argument, think of the problem of deriving an estimator of
$\beta_1$ by using (5.8) as the likelihood function. Taking logs of (5.8) we
have:

$$\log D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = \log D(y_t \mid z_t, Y_{t-1}, Z_{t-1}, \beta_1) + \log D(z_t \mid Y_{t-1}, Z_{t-1}, \beta_2) \qquad (5.9)$$

from which it is clear that the log of the joint density is equal to the sum of
two terms. The second term is a constant with respect to $\beta_1$: it does not
affect the maximum likelihood estimator of $\beta_1$ and can be ignored when the
main interest of research is inference on $\beta_1$. When the sequential cut can
be operated and $\beta_1$ and $\beta_2$ are “variation free”, $z_t$ is said to
be weakly exogenous for the estimation of $\beta_1$. Weak exogeneity can be
contrasted with Granger non-causality (see Granger [12]). $z_t$ Granger-causes
$y_t$ if knowledge of $z_t$ helps the prediction of $y_{t+j}$, $j > 0$. Granger
causality is independent of the choice of the parameters of interest, while weak
exogeneity obviously is not. As a consequence, it is perfectly admissible that
$z_t$ is not Granger-caused by $y_t$ and yet is not weakly exogenous for the
estimation of the parameters of interest. Think of the following case:


$$D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = D(y_t \mid z_t, Y_{t-1}, Z_{t-1}, \beta_1, \beta_2)\, D(z_t \mid Z_{t-1}, \beta_1, \beta_2) \qquad (5.10)$$


The link between Granger causality and weak exogeneity is established by the
concept of strong exogeneity, which is defined as the intersection of the two
concepts. We therefore have strong exogeneity when the joint density can be
factorised as follows:

$$D(y_t, z_t \mid Y_{t-1}, Z_{t-1}, \beta) = D(y_t \mid z_t, Y_{t-1}, Z_{t-1}, \beta_1)\, D(z_t \mid Z_{t-1}, \beta_2) \qquad (5.11)$$

Weak exogeneity constitutes the basis for the definition of a third concept of
exogeneity: super-exogeneity. Super-exogeneity requires weak exogeneity and that
the conditional model $D(y_t \mid z_t, Y_{t-1}, Z_{t-1}, \beta_1)$ is
structurally invariant, i.e. changes in the distribution of the marginal model
for $z_t$ do not affect the $\beta_1$ parameters.

These three concepts are useful to define the validity of the reduction from
the data-congruent reduced form to the adopted structural model:

• if the objective of the analysis is inference on the $\beta_1$ parameters,
then the joint density can be reduced to a conditional model if $z_t$ is weakly
exogenous for the estimation of the parameters of interest;

• if the objective of the analysis is dynamic simulation, then the joint density
can be reduced to a conditional model if $z_t$ satisfies the conditions for
strong exogeneity;

• if the objective of the analysis is econometric policy evaluation, then the
joint density can be reduced to a conditional model if $z_t$ satisfies the
conditions for super-exogeneity.

Tests for the validity of all three concepts have been developed to support the
validity of the last stage of the reduction process.

5.8.2 Exogeneity in ECM representations 

To illustrate how the concepts of exogeneity are applied to linear dynamic models, 
consider the following DGP: 


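$$\begin{aligned}
y_t &= a0_{12} z_t + \varepsilon_{1t} \qquad (5.12) \\
\varepsilon_{1t} &= \rho\,\varepsilon_{1t-1} + u_{1t}, \qquad 0 < \rho < 1 \\
a0_{21} y_t + a0_{22} z_t &= a1_{21} y_{t-1} + a1_{22} z_{t-1} + \varepsilon_{2t} \\
\varepsilon_{2t} &= \varepsilon_{2t-1} + u_{2t} \\
\begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} &\sim N.I.D.\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & 0 \\ 0 & \sigma_{22} \end{pmatrix} \right]
\end{aligned}$$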


This is a non-stationary process, integrated of the first order, admitting one
cointegrating vector. The non-stationarity of the process stems from the
presence of a unit root in $\varepsilon_{2t}$, while the cointegrating vector is
defined by $y_t - a0_{12} z_t$, as $\varepsilon_{1t}$ is stationary. The system
(5.12) can be re-parameterised as follows:


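$$A_0 \begin{pmatrix} \Delta y_t \\ \Delta z_t \end{pmatrix} = A_1 \begin{pmatrix} \Delta y_{t-1} \\ \Delta z_{t-1} \end{pmatrix} + C \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} \qquad (5.13)$$

$$A_0 = \begin{pmatrix} 1 & -a0_{12} \\ a0_{21} & a0_{22} \end{pmatrix}, \qquad
A_1 = \begin{pmatrix} 0 & 0 \\ a1_{21} & a1_{22} \end{pmatrix}, \qquad
C = \begin{pmatrix} -(1-\rho) & a0_{12}(1-\rho) \\ 0 & 0 \end{pmatrix}$$

$$C = A_0\alpha\beta'$$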
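The re-parameterisation can be checked directly from the equations of (5.12).
The two steps below are a sketch using only the definitions given above: the
first equation is quasi-differenced using the AR(1) error, the second is
first-differenced using the random-walk error.

```latex
% First equation: substitute the AR(1) error and subtract y_{t-1} - a0_{12} z_{t-1}
y_t - a0_{12} z_t = \rho\,(y_{t-1} - a0_{12} z_{t-1}) + u_{1t}
\;\Longrightarrow\;
\Delta y_t - a0_{12}\,\Delta z_t = -(1-\rho)\,(y_{t-1} - a0_{12} z_{t-1}) + u_{1t}

% Second equation: take first differences, using \Delta\varepsilon_{2t} = u_{2t}
a0_{21}\,\Delta y_t + a0_{22}\,\Delta z_t
  = a1_{21}\,\Delta y_{t-1} + a1_{22}\,\Delta z_{t-1} + u_{2t}
```

Stacking the two lines gives the first and second rows of $A_0$, $A_1$ and $C$.
Only the first row of $C$ is non-zero, and it is proportional to
$(1, -a0_{12})$, which is why the system has a single cointegrating vector.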


Note that (5.13) can be considered a congruent representation of the DGP, as it
features well-behaved residuals. This is not true of (5.12), which features
autocorrelated residuals. The autocorrelation is generated by the omitted
first-order dynamics in the static equation. The omitted dynamics admit specific
restrictions known as COMFAC (common factor restrictions) 2.

2 The common factor restriction is special in that the effects of the omitted
dynamics can be cured by a Cochrane-Orcutt estimator (static model +
autocorrelated error terms).
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To analyse the different concepts of exogeneity, consider the probabilistic 
structure of the data underlying model (5.13). In other words, let us derive the 
reduced form associated with (5.13) : 


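$$\begin{pmatrix} \Delta y_t \\ \Delta z_t \end{pmatrix} = A_0^{-1}A_1 \begin{pmatrix} \Delta y_{t-1} \\ \Delta z_{t-1} \end{pmatrix} + A_0^{-1}C \begin{pmatrix} y_{t-1} \\ z_{t-1} \end{pmatrix} + A_0^{-1} \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} \qquad (5.14)$$

$$A_0^{-1} = \frac{1}{k} \begin{pmatrix} a0_{22} & a0_{12} \\ -a0_{21} & 1 \end{pmatrix} \qquad (5.15)$$

$$k = a0_{22} + a0_{21}a0_{12} \qquad (5.16)$$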

From the reduced form we have that the conditional joint density of y t and 
z t can be written as follows: 


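$$\begin{pmatrix} \Delta y_t \\ \Delta z_t \end{pmatrix} \Big|\, I_{t-1} \sim N.I.D.\left[ \begin{pmatrix} \mu_y \\ \mu_z \end{pmatrix}, \begin{pmatrix} \sigma_{yy} & \sigma_{yz} \\ \sigma_{yz} & \sigma_{zz} \end{pmatrix} \right] \qquad (5.17)$$

where

$$\mu_y = \frac{a0_{12}a1_{21}}{k}\Delta y_{t-1} + \frac{a0_{12}a1_{22}}{k}\Delta z_{t-1} - \frac{a0_{22}(1-\rho)}{k}(y_{t-1} - a0_{12}z_{t-1})$$

$$\mu_z = \frac{a1_{21}}{k}\Delta y_{t-1} + \frac{a1_{22}}{k}\Delta z_{t-1} + \frac{a0_{21}(1-\rho)}{k}(y_{t-1} - a0_{12}z_{t-1})$$

$$\sigma_{yy} = \left(\frac{a0_{22}}{k}\right)^2\sigma_{11} + \left(\frac{a0_{12}}{k}\right)^2\sigma_{22}$$

$$\sigma_{zz} = \left(\frac{a0_{21}}{k}\right)^2\sigma_{11} + \left(\frac{1}{k}\right)^2\sigma_{22}$$

$$\sigma_{yz} = -\left(\frac{a0_{22}}{k}\right)\left(\frac{a0_{21}}{k}\right)\sigma_{11} + \left(\frac{a0_{12}}{k}\right)\left(\frac{1}{k}\right)\sigma_{22}$$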

By applying the known properties of the multivariate normal, we derive from the
statistical representation of the data the conditional mean of $\Delta y_t$ with
respect to $\Delta z_t$ and $I_{t-1}$ as follows:

$$E(\Delta y_t \mid \Delta z_t, I_{t-1}) = \mu_y + \frac{\sigma_{yz}}{\sigma_{zz}}(\Delta z_t - \mu_z) \qquad (5.18)$$

$z_t$ is said to be weakly exogenous for the estimation of the parameters of
interest if the conditional mean for $\Delta y_t$ derived from (5.18) coincides
with the conditional mean for $\Delta y_t$ derived from the first equation of
model (5.13). As the conditional mean from the first equation of (5.13) is:

$$E(\Delta y_t \mid \Delta z_t, I_{t-1}) = a0_{12}\Delta z_t - (1-\rho)(y_{t-1} - a0_{12}z_{t-1}) \qquad (5.19)$$

weak exogeneity of $\Delta z_t$ for the estimation of the parameter $a0_{12}$ is
obtained when $a0_{21} = 0$.

Strong exogeneity requires Granger non-causality in addition to weak
exogeneity; it is satisfied when $a0_{21} = 0$ and $a1_{21} = 0$.
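The weak exogeneity condition can be checked numerically by computing the
coefficients of the conditional mean (5.18) implied by the reduced form of
(5.13) and comparing them with those in (5.19). The sketch below is our own
working of that algebra, obtained by pre-multiplying (5.13) by the inverse of
$A_0$; the function name and the parameter values are hypothetical.

```python
def conditional_mean_coefficients(a0_12, a0_21, a0_22, a1_21, a1_22, rho, s11, s22):
    """Coefficients of E(dy_t | dz_t, I_{t-1}) implied by the reduced form of (5.13)."""
    k = a0_22 + a0_12 * a0_21                       # determinant of A0
    s_yz = (-a0_22 * a0_21 * s11 + a0_12 * s22) / k ** 2
    s_zz = (a0_21 ** 2 * s11 + s22) / k ** 2
    slope = s_yz / s_zz                             # coefficient on dz_t
    # Reduced-form loadings on the lagged disequilibrium y_{t-1} - a0_12 * z_{t-1}.
    ecm_y = -a0_22 * (1 - rho) / k
    ecm_z = a0_21 * (1 - rho) / k
    return {
        "dz": slope,
        "ecm": ecm_y - slope * ecm_z,
        "dy_lag": (a0_12 - slope) * a1_21 / k,
        "dz_lag": (a0_12 - slope) * a1_22 / k,
    }


common = dict(a0_12=0.5, a0_22=1.0, a1_21=0.2, a1_22=0.1, rho=0.6, s11=1.0, s22=2.0)

exogenous = conditional_mean_coefficients(a0_21=0.0, **common)
not_exogenous = conditional_mean_coefficients(a0_21=0.4, **common)

print(exogenous)       # slope equals a0_12 and the adjustment coefficient equals -(1 - rho)
print(not_exogenous)   # slope differs from a0_12 and lagged differences enter
```

With $a0_{21} = 0$ the conditional mean reproduces (5.19) exactly; with
$a0_{21} \neq 0$ the regression of $\Delta y_t$ on $\Delta z_t$ no longer
recovers $a0_{12}$.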

Super-exogeneity requires weak exogeneity and independence of the parameters of
interest from the distribution of $\Delta z_t$. In our example, whenever the
conditions for weak exogeneity are satisfied, super-exogeneity also holds. To
show a case in which this does not happen, consider the following modification
of our DGP:

$$\begin{aligned}
y_t &= a0_{12} E(z_{t+1} \mid I_t) + \varepsilon_{1t} \qquad (5.20) \\
z_t &= a1_{22} z_{t-1} + \varepsilon_{2t} \\
\begin{pmatrix} \varepsilon_{1t} \\ \varepsilon_{2t} \end{pmatrix} &\sim N.I.D.\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} \sigma_{11} & 0 \\ 0 & \sigma_{22} \end{pmatrix} \right]
\end{aligned}$$

In this case the conditional mean $E(y_t \mid z_t, I_{t-1})$ is given by the
following expression:

$$E(y_t \mid z_t, I_{t-1}) = a0_{12}a1_{22}z_t \qquad (5.21)$$

and it depends on $a1_{22}$, the parameter determining the conditional mean of
$z_t$.
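A small simulation illustrates why super-exogeneity fails in this example. The
sketch below generates data from (5.20) under two hypothetical values of
$a1_{22}$ (two "policy regimes") and estimates the slope of $y_t$ on $z_t$ by
least squares through the origin; the sample size, seed and parameter values
are our own choices.

```python
import random


def estimated_slope(a0_12, a1_22, T=5000, seed=1):
    """OLS slope of y_t on z_t when y_t depends on the expectation of z_{t+1}."""
    rng = random.Random(seed)
    z, num, den = 0.0, 0.0, 0.0
    for _ in range(T):
        z = a1_22 * z + rng.gauss(0.0, 1.0)
        # Under the assumed AR(1) for z, E(z_{t+1} | I_t) = a1_22 * z_t.
        y = a0_12 * a1_22 * z + rng.gauss(0.0, 1.0)
        num += y * z
        den += z * z
    return num / den


slope_regime_a = estimated_slope(a0_12=2.0, a1_22=0.9)
slope_regime_b = estimated_slope(a0_12=2.0, a1_22=0.3)
print(round(slope_regime_a, 2), round(slope_regime_b, 2))
```

The structural parameter $a0_{12}$ is the same in both runs, yet the estimated
coefficient moves with $a1_{22}$: a conditional model estimated under one
regime would be misleading for evaluating a change in the process for $z_t$.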

We conclude with the following remarks: 

• exogeneity is defined independently from the parameters defining the
cointegrating vectors, but it is related to the weights. Weak exogeneity has a
precise relation with the direction of adjustment in the presence of
disequilibria;

• this is a special case, as we take a diagonal variance-covariance matrix. In
general, weak exogeneity requires a lower triangular structure and absence of
correlation in the variance-covariance matrix;

• note the impossibility of reverse regression. The conditions for weak
exogeneity of $\Delta z_t$ for the estimation of $a0_{12}$ are mutually
exclusive with the conditions for weak exogeneity of $\Delta y_t$ for the
estimation of
5.8.3 Testing exogeneity

The preceding example shows how weak exogeneity can be tested for within the
framework of cointegration. To provide a more general introduction to the issue
of testing exogeneity, consider a bivariate process for two generic variables $y_t$ and
$z_t$, conditioned with respect to the information available, which includes all past
history of the process:

$$\begin{pmatrix} y_t \\ z_t \end{pmatrix} \Big|\, I_{t-1} \sim N.I.D.\left( \begin{pmatrix} \mu_t^{y} \\ \mu_t^{z} \end{pmatrix}, \begin{pmatrix} \sigma_t^{yy} & \sigma_t^{yz} \\ \sigma_t^{zy} & \sigma_t^{zz} \end{pmatrix} \right) \qquad (5.22)$$

the conditional model for $y_t$ can be written as:

$$(y_t \mid z_t, I_{t-1}) \sim N.I.D.\left( \mu_t^{y} + \frac{\sigma_t^{yz}}{\sigma_t^{zz}}\,(z_t - \mu_t^{z}),\; \sigma_t^{yy} - \frac{(\sigma_t^{yz})^2}{\sigma_t^{zz}} \right) \qquad (5.23)$$

and the marginal model for $z_t$ is instead:

$$(z_t \mid I_{t-1}) \sim N.I.D.\left( \mu_t^{z}, \sigma_t^{zz} \right) \qquad (5.24)$$

The parameters of interest feature the following relationship:

$$\mu_t^{y} = \beta \mu_t^{z} + w_{t-1}'\delta \qquad (5.25)$$

of which, for the sake of exposition, we consider the special case:

$$y_t = \beta z_t + w_{t-1}'\delta + u_t \qquad (5.26)$$

where $w_{t-1}$ is included in $I_{t-1}$.

Weak exogeneity of $z_t$ for the estimation of $\beta$ implies that this parameter
can be estimated directly from (5.26) without any loss of relevant information.
For this to happen, it is necessary that we have a sequential cut and that the
conditional model does not depend on $\mu_t^{z}$, $\sigma_t^{yz}$, $\sigma_t^{zz}$. To pin down formally the
conditions for weak exogeneity, substitute (5.25) in (5.23), obtaining:

$$(y_t \mid z_t, I_{t-1}) \sim N.I.D.\left( \beta z_t + w_{t-1}'\delta + \left( \frac{\sigma_t^{yz}}{\sigma_t^{zz}} - \beta \right)(z_t - \mu_t^{z}),\; \sigma_t^{yy} - \frac{(\sigma_t^{yz})^2}{\sigma_t^{zz}} \right) \qquad (5.27)$$

Therefore we have weak exogeneity of $z$ for the estimation of $\beta$ if the following
condition is satisfied:

$$\frac{\sigma_t^{yz}}{\sigma_t^{zz}} = \beta$$

so that the term in $(z_t - \mu_t^{z})$ drops out of the conditional mean in (5.27).

From this condition we can easily understand the tests for exogeneity available in
the literature (Hausman [13], Wu [24]), which are based on two-stage procedures.

In the first stage $\mu_t^{z}$ is parameterised by fitting a conditional model for $z_t$ of
the following type:

$$z_t = s_t'\pi + u_t \qquad (5.28)$$

where the vector $s_t$ includes all variables necessary to obtain a satisfactory
specification for $z_t$. In the second stage the significance of the residuals from (5.28)
is tested in equation (5.26): the null of weak exogeneity coincides with the null of
non-significance of this constructed variable. The argument here can be extended to
test the null of super-exogeneity³. The alternative hypothesis is now more
complicated, as follows:

$$\beta(\mu_t^{z}, \sigma_t^{zz}) = \beta_0 + \beta_1 \mu_t^{z} + \beta_2 \sigma_t^{zz} + \beta_3 \frac{\sigma_t^{zz}}{\mu_t^{z}} \qquad (5.29)$$

and the null of interest is weak exogeneity augmented by $\beta_1 = \beta_2 = \beta_3 = 0$.
To see how the test is derived, substitute (5.29) and (5.25) in (5.23), obtaining:

$$(y_t \mid z_t, I_{t-1}) \sim N.I.D.\left( \mu_t, \Omega_t \right) \qquad (5.30)$$

$$\mu_t = \beta_0 z_t + w_{t-1}'\delta + \left( \frac{\sigma_t^{yz}}{\sigma_t^{zz}} - \beta_0 \right)(z_t - \mu_t^{z}) + \beta_1 (\mu_t^{z})^2 + \beta_2 \mu_t^{z}\sigma_t^{zz} + \beta_3 \sigma_t^{zz}$$

$$\Omega_t = \sigma_t^{yy} - \frac{(\sigma_t^{yz})^2}{\sigma_t^{zz}}$$

By using the first-order expansion:

$$\frac{\sigma_t^{yz}}{\sigma_t^{zz}} = \delta_0 + \delta_1 \sigma_t^{zz}$$

we reach the following estimable relation:

$$y_t = \beta_0 z_t + w_{t-1}'\delta + (\delta_0 - \beta_0)(z_t - \mu_t^{z}) + \delta_1 \sigma_t^{zz}(z_t - \mu_t^{z}) + \beta_1 (\mu_t^{z})^2 + \beta_2 \mu_t^{z}\sigma_t^{zz} + \beta_3 \sigma_t^{zz}$$

where the null hypothesis of interest can now be empirically tested by
parameterising the first two moments of the conditional model for $z_t$.

³ See Engle and Hendry [7], Favero and Hendry [10], Ericsson and Irons [8].
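The two-stage logic of the weak exogeneity test can be illustrated on simulated data.
What follows is only a sketch under stated assumptions, not the procedure of any
specific package: the DGP, the parameter values (0.8 in the first stage, a slope of 1 in
the conditional model), the helper names `invert`, `ols` and `exogeneity_t_stat`, and the
use of a simple t-ratio on the added residual are all illustrative choices. The
parameter `rho` is the assumed correlation between the errors of the two equations, so
that `rho = 0` corresponds to weak exogeneity of $z_t$.

```python
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        pivot = m[c][c]
        m[c] = [v / pivot for v in m[c]]
        for r in range(n):
            if r != c:
                f = m[r][c]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[c])]
    return [row[n:] for row in m]

def ols(X, y):
    """Return OLS coefficients, their standard errors and the residuals."""
    n, k = len(X), len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(k)]
    inv = invert(xtx)
    b = [sum(inv[i][j] * xty[j] for j in range(k)) for i in range(k)]
    e = [yi - sum(bi * xi for bi, xi in zip(b, r)) for r, yi in zip(X, y)]
    s2 = sum(v * v for v in e) / (n - k)
    se = [(s2 * inv[i][i]) ** 0.5 for i in range(k)]
    return b, se, e

def exogeneity_t_stat(rho, n=5000, seed=1):
    """t-ratio of the first-stage residual added to the conditional model."""
    rng = random.Random(seed)
    s, z, y = [], [], []
    for _ in range(n):
        s_t = rng.gauss(0.0, 1.0)
        u_z = rng.gauss(0.0, 1.0)
        u_y = rho * u_z + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0)
        z_t = 0.8 * s_t + u_z            # first-stage model for z_t
        s.append(s_t)
        z.append(z_t)
        y.append(1.0 * z_t + u_y)        # conditional model with slope 1
    _, _, resid = ols([[1.0, s_t] for s_t in s], z)                   # stage 1
    b, se, _ = ols([[1.0, z_t, r] for z_t, r in zip(z, resid)], y)    # stage 2
    return b[2] / se[2]

print(exogeneity_t_stat(0.0), exogeneity_t_stat(0.5))
```

With `rho = 0` the t-ratio should behave like a draw from a standard normal, while with
`rho = 0.5` it should be far in the rejection region. A super-exogeneity test along the
lines of the estimable relation above would add further constructed regressors (functions
of the fitted mean and variance of $z_t$) and test their joint significance.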

Hendry [14] provides an alternative assessment of super-exogeneity by analysing
the encompassing implications of feedback versus feedforward models. This
procedure is based on the explicit consideration of two alternative specifications
for the DGP.

The feedback model, denoted $H_b$, is:

$$y_t = \beta' z_t + v_t \qquad (5.31)$$

$$E_b(z_t v_t) = 0$$

The feedforward model, denoted $H_f$, is:

$$y_t = \delta' E(z_{t+1} \mid I_t) + e_t \qquad (5.32)$$

$$z_t = \gamma_t z_{t-1} + u_t \qquad (5.33)$$

$$E_f(z_t e_t) = 0$$

$$u_t \sim i.d.(0, \Omega_t)$$

Note that in $H_f$ the parameters of the marginal model for $z_t$ are functions of
time through $\gamma_t$ and $\Omega_t$. Moreover, we restrict ourselves to the case in which the
only relevant information in $I_t$ to predict $z_{t+1}$ is the realization of $z$ at time $t$.

We can now explore the encompassing predictions of each model for the other.
We do so by evaluating the performance of each model when the congruent
representation of the DGP is the alternative model.

When (5.31) and (5.33) are a congruent representation of the DGP, the
following implications hold:

1. when $\gamma_t$ and $\Omega_t$ are non-constant, the projection of $y_t$ on $z_{t-1}$ is also
non-constant; in fact

$$E_b(y_t \mid z_{t-1}) = \beta'\gamma_t z_{t-1}$$

2. the error variance is also non-constant:

$$y_t - E_b(y_t \mid z_{t-1}) = \beta' u_t + v_t = \varphi_t$$

$$E_b(\varphi_t^2) = \sigma_v^2 + \beta'\Omega_t\beta$$

3. the projection of $y_t$ on $z_{t-1}$ should fit worse than the behavioural
(feedback) model; in fact:

$$E_b(\varphi_t^2) = \sigma_v^2 + \beta'\Omega_t\beta > \sigma_v^2$$

4. the behavioural (feedback) model should feature constant parameters.

When instead (5.32) and (5.33) are a congruent representation of the DGP,
the following implications hold:

1. the conditional model cannot be constant when the marginal model for $z_t$
is sufficiently variable, since:

$$E_f(y_t \mid z_t) = \delta'\gamma_t z_{t-1}$$

2. the projection of y t on z t _i is non constant but with parameter vector 

3. no variance ranking is possible, as:

$$y_t - E_f(y_t \mid z_t) = e_t$$

as in (5.32).

The analysis of the encompassing implications of the two cases reveals that 
when the feedback model is stable and the marginal process is not stable, then the 
feedforward specification cannot be a congruent representation of the DGP. As a 
consequence, the relevance of the Lucas critique could be analysed by assessing 
simultaneously the stability of the feedback structural model and the stability 
of the marginal models for the regressors in the feedback model. 
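The contrast between the two sets of implications can be reproduced in a small
simulation. The sketch below is illustrative only: scalar $z_t$, two regimes for the
autoregressive parameter of the marginal model (0.9 and 0.3), a unit structural
coefficient, unit error variances and the function names are assumptions made here,
and agents in the feedforward case are assumed to know the parameter of the current
regime.

```python
import random

def slope(x, y):
    """OLS slope of y on x (no intercept; all series have zero mean)."""
    return sum(a * b for a, b in zip(x, y)) / sum(a * a for a in x)

def regime_slopes(feedback, gammas=(0.9, 0.3), coef=1.0, n=20000, seed=2):
    """For each regime of the marginal model return the slope of y_t on z_t
    and the slope of y_t on z_{t-1}."""
    rng = random.Random(seed)
    results = []
    for gamma in gammas:
        z_lag, z_now, y = [], [], []
        z_prev = 0.0
        for _ in range(n):
            z = gamma * z_prev + rng.gauss(0.0, 1.0)   # marginal model for z_t
            if feedback:
                target = z                             # feedback: y_t depends on z_t
            else:
                target = gamma * z                     # feedforward: E(z_{t+1} | I_t)
            y.append(coef * target + rng.gauss(0.0, 1.0))
            z_lag.append(z_prev)
            z_now.append(z)
            z_prev = z
        results.append((slope(z_now, y), slope(z_lag, y)))
    return results

print("feedback   :", regime_slopes(True))
print("feedforward:", regime_slopes(False))
```

Under the feedback DGP the slope of $y_t$ on $z_t$ should be roughly the same in both
regimes while the projection on $z_{t-1}$ shifts with the marginal process; under the
feedforward DGP the conditional slope itself shifts. This is the pattern the encompassing
argument exploits: a stable conditional model together with an unstable marginal model
is hard to reconcile with the feedforward specification.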

This procedure deserves some discussion.

A first observation relates to the power of tests for structural stability: for
the procedure to work it is essential that the marginal model is sufficiently
variable and that such variability is detectable through tests for parameter
stability. As we have already seen, the issue is not trivial, in that multiple breaks
at unknown points are not easily detected.

Setting aside the power of the tests, there is a logical issue related to the
reduction procedure. If parameter stability is taken as one of the criteria
for congruency, then congruent reduced forms should never feature parameter
instability. In practice, as we have seen in our application, congruent
specifications often need the inclusion of dummies. Therefore, the significance of the same
dummies in different equations of the adopted model could be exploited to apply
the procedure for the evaluation of the relevance of the Lucas critique.

A related question refers to the power of the procedure in the case of limited
information, i.e. the case in which the parameter instability is generated by
omitted variables in the marginal models. Hendry [14] considers such a case
explicitly, by adopting the following alternative specification for the marginal model:

$$z_t = \gamma_1 z_{t-1} + \gamma_2 s_{t-1} + u_{2t} \qquad (5.34)$$

$$u_{2t} \sim i.d.(0, \Omega_2) \qquad (5.35)$$

When (5.34) is a congruent representation of the DGP, (5.33) features
instability because of a limited information problem: the omission of $s_{t-1}$ from the
relevant information set. However, if (5.33) is observed, then the relation between
$z_t$ and $s_t$ cannot be constant; in fact it must be the case that:

$$s_t = M_t z_t + \xi_t$$

$$\xi_t \sim i.d.(0, S_t)$$

and then

$$\gamma_t = \gamma_1 + \gamma_2 M_{t-1}$$

$$\Omega_t = \Omega_2 + \gamma_2 S_{t-1} \gamma_2'$$

and the result of stability of the feedback model paired with the instability
of the (mis-specified) marginal model can still rule out the congruency of the
feedforward model.
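The step from (5.34) to a time-varying marginal model can be spelled out. The
following is a sketch under the assumption, suggested by the text, that the omitted
variable is linked to $z$ by a relation of the form $s_t = M_t z_t + \xi_t$, with
$\xi_t \sim i.d.(0, S_t)$ uncorrelated with $u_{2t}$; the dating of $M$ and $S$ below simply
follows from lagging that relation once and substituting it in (5.34):

```latex
\begin{aligned}
z_t &= \gamma_1 z_{t-1} + \gamma_2 s_{t-1} + u_{2t} \\
    &= \gamma_1 z_{t-1} + \gamma_2 \left( M_{t-1} z_{t-1} + \xi_{t-1} \right) + u_{2t} \\
    &= \underbrace{(\gamma_1 + \gamma_2 M_{t-1})}_{\gamma_t}\, z_{t-1}
       + \underbrace{\gamma_2 \xi_{t-1} + u_{2t}}_{u_t},
\qquad
\operatorname{Var}(u_t) = \Omega_2 + \gamma_2 S_{t-1} \gamma_2' = \Omega_t .
\end{aligned}
```

Under these assumptions any variation in $M_t$ or $S_t$ shows up as instability of the
marginal model that omits $s_{t-1}$, even though (5.34) itself has constant parameters.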

5.9 A model of the monetary transmission mechanism 

To illustrate the specification of a structural model for the monetary transmission 
mechanism we consider as a baseline the cointegrated reduced form discussed in 
one of the previous sections: 
$$R^{b*}_{t-1} = \pi_{t-1} + 0.22\,y_{t-1} - 0.08 \qquad (5.36)$$


Note that (5.36) is the result of the reduction of the baseline model, which
contains five equations. The original model delivered two cointegrating
relationships: we have identified the first one as an interest rate reaction function and
the second one as a rule for determining the interest rate on bank deposits. To
describe the monetary transmission mechanism under interest rate targeting we
need to supplement the interest rate reaction function with equations for the
target variables, inflation and output. Real money, being demand determined,
loses interest and so does the opportunity cost of holding money. Therefore we
have omitted from the original model the two equations determining real money
and the interest rate on bank deposits, together with the equilibrium relationship
for the interest rate on bank deposits. The validity of this step in the reduction
process is testable. Congruency of our selected specification requires that the
weights on the second cointegrating vector can be constrained to zero in our
three maintained equations, that the weights on the first cointegrating vector
can be constrained to zero in the equations for real money and the interest rate
on bank deposits, and finally that lagged values of real money and of the interest
rate on bank deposits do not enter system (5.36) significantly. Having asserted
the validity of this further step in the reduction, we proceed to the specification of
the following structural model:

A y t = 0.23 + 0.33 An t _! - 0.21 AR? , - 1.19 DUM7312 - 1.11DUM7308 

(0.03) (0.06) (0.09) (0.46) (0.46) 

(5.37) 

A 7 r t = —0.05+ 0.08An t _i + 0.06An t _s + 0.2 A7r t _5 + 0.14A7r t _6 (5.38) 

(0.02) (0.03) (0.03) (0.05) (0.05) 

+ 0.03 A 12 LPCM t _ 1 - 0.02 A 12 LPCM t _ 2 ~ 0.92 DUM7307 

(0.005) (0.006) (0.27) 

+ 0.87 DU M7308 + 0.69 DUM7407 - 0.56 DUM7408 + 1.04DCM7409 

(0.26) (0.26) (0.25) (0.25) 

— 0.75 DU M7505 

(0.25) 


A R b t = -8.34 + 0.31 A7r t + 0.08 Am_ 3 + 0.36 A R b t , + 0.01 A 12 LPCM t (5.39) 

(3.08) (0.13) (0.03) (0.06) (0.006) 

-0.013A 12 LPCM t _ 1 - 0.04 (i^ - R b *\ + 0.60DCM7306 + 0.71DUM7307 

(0.005) (0.018) V ' (0.24) (0.28) 

-0.90DCM7310 + 1.06DCM7311 - 0.76 DUM7312 - 0.86DUM7402 

(0.24) (0.25) (0.24) (0.24) 

+1.13 DUM7403 + 1.64DCM7408 - 1.72 DUM7409 - 0.72DUM7501 

(0.24) (0.25) (0.28) (0.24) 

+ 0.49 DU M7808 + 0.57DCM7811 

(0.23) (0.28) 

LR test of over-identifying restrictions: $\chi^2(89) = 95.3354$ [0.3037].

The model is estimated by FIML over the sample 1961:2-1979:8. The 89
over-identifying restrictions imposed by the reduced form implicit in our structure on
the unconstrained reduced form are not rejected. The first equation can be
interpreted as an aggregate demand equation, along which the output gap (deviation
of output from a stochastic trend) depends on the lagged change in nominal interest
rates. The second equation is a stylised aggregate supply, which determines
inflation as a function of past inflation, commodity price inflation and the output
gaps. Finally, the third equation is an interest rate reaction function, which
describes short-run dynamics around a long-run solution determined by the response
of interest rates to inflation and output. Note that, because of the dynamic
specification, the response of the monetary instruments to fluctuations in the target
variables is different in the short run and in the long run. To illustrate the
within-sample performance of the model we report actual and fitted values in Figure
5.3.

5.9.1 Simulating monetary policy 

We are now in a position to simulate monetary policy. We simulate the
impact of a hundred basis point exogenous monetary policy shock by computing
dynamic multipliers. The baseline solution is obtained by dynamic simulation:
for given values of the exogenous variables, the three endogenous variables are
generated by equations (5.37), (5.38) and (5.39) over the sample considered for
estimation. The perturbed solution is obtained by adding an exogenous one-off
100 basis point shock to equation (5.39) in the first period of the simulation.
Obviously, a one-off hundred basis point shock to the first difference of the
policy rate is a permanent hundred basis point shock to the level of the policy
rate. Dynamic multipliers are then computed by adapting the E-Views
procedures already discussed in Chapter 4. All computations are available in the file
LSE.WF1. We report dynamic multipliers in Figure 5.4.
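The mechanics of the exercise (baseline solution, perturbed solution, difference
between the two) can be sketched independently of E-Views. The code below does not
reproduce the estimated model or the contents of LSE.WF1: it uses a toy three-equation
system whose coefficients are invented for illustration, with no exogenous variables
or dummies, and the function names are hypothetical. Only the structure (aggregate
demand, aggregate supply, and a reaction function with an error-correction term)
mimics the one described in the text.

```python
def simulate(periods, shock=0.0):
    """Dynamically simulate a toy three-equation model; return levels of (y, pi, R).

    All coefficients are hypothetical. Lagged endogenous variables are the
    simulated ones, not the actual ones (dynamic, not static, simulation).
    """
    y = pi = r = 0.0          # initial levels
    dy = dpi = dr = 0.0       # initial first differences
    path = []
    for t in range(periods):
        ecm = r - pi - 0.2 * y                        # deviation from the long-run rule
        new_dy = 0.3 * dy - 0.2 * dr                  # aggregate demand
        new_dpi = 0.1 * dy + 0.2 * dpi                # aggregate supply
        new_dr = 0.3 * new_dpi + 0.1 * dy + 0.3 * dr - 0.04 * ecm   # reaction function
        if t == 0:
            new_dr += shock                           # one-off shock to the policy-rate equation
        dy, dpi, dr = new_dy, new_dpi, new_dr
        y, pi, r = y + dy, pi + dpi, r + dr
        path.append((y, pi, r))
    return path

def dynamic_multipliers(periods=24, shock=1.0):
    """Perturbed minus baseline solution, period by period."""
    base = simulate(periods)
    pert = simulate(periods, shock)
    return [tuple(p - b for p, b in zip(pp, bb)) for pp, bb in zip(pert, base)]

for horizon, (m_y, m_pi, m_r) in enumerate(dynamic_multipliers(periods=6)):
    print(horizon, round(m_y, 3), round(m_pi, 3), round(m_r, 3))
```

Because the toy model is linear, the multipliers do not depend on the baseline path,
which is why a zero baseline is enough here. In a model with exogenous variables and
dummies the two solutions would be computed over the same sample with the same
exogenous inputs.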

Dynamic multipliers confirm the stability of the model and reveal a much
stronger impact of monetary policy on output fluctuations than on inflation.
Note also that the pattern of multipliers is much smoother than the
corresponding pattern for the model used to illustrate the Cowles Commission strategy.
Such smoothness is a consequence of the better dynamic specification of the
LSE model.

5.9.2 Model Evaluation 

To complete our comparative evaluation of the LSE and the Cowles Commission
specifications we still have to consider out-of-sample evaluation, where the
performance of the Cowles Commission specification was at its worst. We simulate
the LSE model dynamically over the period 1985:01-1996:03. In doing so we
deliberately skip the period of non-borrowed reserves targeting, where our
specification is clearly not appropriate. The dynamically simulated series are reported
with the actual series in Figure 5.5.

Figure 5.5 shows an improved performance with respect to the Cowles
Commission specification, but clearly there are problems in the out-of-sample
simulation. The implementation of diagnostic tests guarantees the quality of the
within-sample results but cannot insure against structural shifts in parameters:
congruent models within sample might perform very poorly in out-of-sample
simulations.

Fig. 5.3. Actual and fitted values from the structural model

Fig. 5.4. Dynamic multipliers

5.9.3 Testing the Lucas critique 

Our simple model of the monetary transmission mechanism offers an opportunity 
to implement empirically tests for super-exogeneity. 

In our discussion of identification in Chapter 3 we have shown that when a
central bank faces the following intertemporal optimisation problem:

$$\text{Minimize } E_t \sum_{i=0}^{\infty} \delta^{i} L_{t+i} \qquad (5.40)$$

where:

$$L_t = \frac{1}{2}\left[ (\pi_t - \pi^*)^2 + \lambda x_t^2 \right] \qquad (5.41)$$

under the constraints of the following specification for aggregate supply and
demand in a closed economy:

$$x_{t+1} = \beta_x x_t - \beta_r \left( i_t - E_t \pi_{t+1} - r \right) + u^{d}_{t+1} \qquad (5.42)$$

$$\pi_{t+1} = \pi_t + \alpha_x x_t + u^{s}_{t+1} \qquad (5.43)$$

the optimal interest rate rule can be written as:

$$i_t = r + \pi^* + \frac{1 + \alpha_x \beta_r}{\alpha_x \beta_r}\left( E_t \pi_{t+1} - \pi^* \right) + \frac{\lambda}{\delta \alpha_x k}\,\frac{1}{\alpha_x \beta_r}\, E_t x_{t+1} \qquad (5.44)$$

Fig. 5.5. Out-of-sample dynamic simulation

where $k$ is a combination of parameters describing the structure of the
economy and the preferences of the central banker. We then have an intertemporal
optimization framework which offers a feedforward monetary rule, in which the output
gap and inflation are not super-exogenous for the estimation of the parameters
of interest. Contrast this specification with equation (5.39) in our model. Our
estimated equation is a feedback specification, which does not explicitly include
expectations and whose parameters are estimated independently from those in
the aggregate demand and supply schedules. Therefore we have a natural
candidate to test the validity of the Lucas critique. Tests of feedforward versus feedback
models are difficult to apply, in that we have designed our model to pass
diagnostic tests; note, however, that the reaction function and the aggregate supply and
demand equations contain a common set of dummies. This is a clear indication
of common outliers in the three equations, which does not refute the hypothesis
of validity of the feedforward interpretation. The presence of dummies will also
affect the Engle-Hendry super-exogeneity tests. This test is applicable by
exploiting the specification of the supply and demand equations to derive proxies
for the first two moments in the conditional model for these two variables, and
then by adding them to the interest rate reaction function. The impact of the
dummies on the test is determined by the fact that they capture some portion of
the variability in the additional regressors whose joint significance is tested.


5.10 What have we learned? 

In our opinion, the major strengths of the LSE methodology are related to a
careful diagnosis of the problems of the Cowles Commission approach and to the
attempt to give "scientific dignity" to the specification of dynamic econometric
models. The concept of cointegration fits naturally in the context of the dynamic
specification of ECM models. Such a research strategy is based on a multi-step
framework: specification of the VAR and its deterministic component,
identification of the number of cointegrating vectors, identification of the parameters
in the cointegrating vectors, tests on the speed of adjustment with respect to
disequilibria. The results of the final test depend on the outcome of the previous
stages in the empirical analysis, but the outcome of each step is not so easily
and uniquely established empirically. The reduction process has been criticised
by macroeconomists for its tendency to deliver preferred specifications that are
"...a bit over-cooked..." and to loosen considerably the link between econometric
model and economic theory. Consider the following money demand specification, taken
from Baba, Hendry, Starr [?], as a typical LSE model:
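$$\begin{aligned}
\Delta (m-p)_t = {} & -\underset{[0.097]}{0.334}\,\Delta_4 (m-p)_{t-1} - \underset{[0.039]}{0.156}\,\Delta^2 (m-p)_{t-4} - \underset{[0.015]}{0.249}\,\left( m - p - \tfrac{1}{2} y \right)_{t-2} \\
& - \underset{[0.046]}{0.33}\,\hat{\Delta} p_t - \underset{[0.132]}{1.097}\,\Delta_4 p_{t-1} + \underset{[0.079]}{0.859}\,V_t + \underset{[1.49]}{11.68}\,\Delta SV_{t-1} \\
& - \underset{[0.104]}{1.409}\,AS_t - \underset{[0.063]}{0.973}\,AR_{1t} - \underset{[0.049]}{0.255}\,\Delta R_{ma,t} + \underset{[0.055]}{0.435}\,R_{nsa,t} \\
& + \underset{[0.07]}{0.395}\,\Delta Ay_t + \underset{[0.003]}{0.013}\,D_t + \underset{[0.02]}{0.352} + u_t
\end{aligned} \qquad (5.45)$$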

where heteroscedasticity-consistent standard errors are reported in brackets. The BHS
specification for U.S. money demand is estimated on quarterly data covering the
period 1960-1988. $m$ is the log of M1; $y$ is the log of real GNP using 1982 as
base year; $p$ is the log of the deflator; $\Delta^2$ is the square of the difference operator
$\Delta$; $\hat{\Delta} p_t = \Delta(1 + \Delta) p_t$; $\Delta_4 (m-p)_{t-1} = 0.25\,((m-p)_{t-1} - (m-p)_{t-5})$; $V_t$ is
a nine-quarter moving average of quarterly averages of twelve-month moving
standard deviations of 20-year bond yields; $SV_t = \max(0, S_t) \cdot V_t$, where $S_t$ is the
spread between the 20-year Treasury bond yield and the coupon equivalent yield
on a one-month T-bill; $AS_t = 0.5\,(S_t + S_{t-1})$; $AR_{1t}$ is a two-quarter moving
average of the one-month T-bill yield; $R_{ma,t}$ is the maximum of a passbook
savings rate, a weighted certificate of deposit rate and a weighted money market
mutual fund rate; $R_{nsa,t}$ is the average of weighted NOW and SuperNOW rates;
$Ay_t = 0.5\,(y_t + y_{t-1})$ and $D_t$ is a credit control dummy which is -1 in 1980(2), 1
in 1980(3), and zero everywhere else. BHS report 11 diagnostics, all passed.

The achievement of data congruency implies some evident cost in terms of
parsimony of the specification and economic interpretability of the results. Equation
(5.45) also illustrates why the LSE methodology is not easily applied to systems of
equations, even of very limited dimensions. The general-to-specific methodology is
usually applied in single-equation specifications (money demand and consumption
of non-durables are the preferred applications of the LSE approach).
Applications to systems with few variables are reported in the literature, but it
becomes very hard and very rare to apply such a methodology when the
dimension of the system exceeds a small number (say five equations). Moreover, we
have seen that the applicability of the concept of cointegration becomes rapidly
more complicated as $n$ increases in $n$-variate systems.

Faust and Whiteman [9] note that (5.45) is a much richer specification than the
one implicitly contained in the standard VAR approach, including moving
averages and moving standard deviations of interest rates. I am somewhat skeptical
that such a specification could be produced by a VAR approach to cointegration, or
by any VAR analysis. Criticisms of the use of these generated variables are mainly
based on the argument that, by construction, they capture within-sample
fluctuations in the data and their performance out-of-sample worsens considerably.
Moreover such transformations, being data instigated, are usually related to
theory with some difficulty. Many applied macroeconometricians do not feel at ease in
using variables which perform well empirically but whose links with theory are
not so clear. Of course, for the LSE methodology, this is a problem with the
profession rather than with the econometric methodology. In fact, this is probably
the centre of the debate.
THE VAR APPROACH

6.1 Introduction: why VAR models?

The LSE methodology has interpreted the failure of the traditional Cowles
Commission approach, heralded by the critiques due to Lucas [36] and Sims [48], as
the result of the use of mis-specified and ill-identified models. The LSE
methodology, however, does not question the potential of macroeconometric modelling
for simulation and econometric policy evaluation. In fact, at the stage of
simulation and policy evaluation, there is no difference between the traditional Cowles
Commission approach and the LSE approach. The LSE solution to the
problems of traditional macroeconometric modelling is concentrated on the stages of
identification and specification. The importance of estimation is de-emphasized,
in that congruency of the specification is considered a much higher priority
than the choice of the most appropriate estimator. No innovation is proposed at
the stage of simulation and policy evaluation: the traditional methods are
applied, after having tested, tested, and tested.

The VAR approach shares with the LSE approach the diagnosis of the
problems of Cowles Commission models, but it also questions the potential of traditional
macroeconometric modelling for policy simulation and econometric policy
evaluation. VAR models of the monetary transmission mechanism differ from
structural LSE models as to the purpose of their specification and estimation. In
the traditional approach the typical question asked within a macroeconometric
framework is "What is the optimal response by the monetary authority to
movements in macroeconomic variables in order to achieve given targets for the
same variables?". The VAR approach fully recognizes the potential of the Lucas
critique and acknowledges that questions like "How should a central bank
respond to shocks in macroeconomic variables?" are to be answered within the
framework of quantitative monetary general equilibrium models of the business
cycle. So the answer has to be based on a theoretical model rather than on
an empirical ad-hoc macroeconometric model. Within this framework there is a
new role for empirical analysis, i.e. to provide evidence on the stylized facts to be
included in the theoretical model adopted for policy analysis and to decide
between competing general equilibrium monetary models. The operationalization
of this research programme is very well described in a recent paper by Christiano,
Eichenbaum and Evans [14]. There are three relevant steps:

• monetary policy shocks are identified in actual economies;

• the response of relevant economic variables to monetary shocks is then
described;

• the same experiment is then performed in the model economies to compare
actual and model-based responses as an evaluation tool and a selection
criterion for theoretical models.

LSE-type structural models and VAR models of the monetary transmission
mechanism have a common structure which, using the notation of Chapter 3,
can be represented as follows:

$$A \begin{pmatrix} Y_t \\ M_t \end{pmatrix} = C(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix} + B \begin{pmatrix} \nu^{Y}_t \\ \nu^{M}_t \end{pmatrix} \qquad (6.1)$$

where $Y$ and $M$ are vectors of macroeconomic (non-policy) variables (e.g. output
and prices) and of variables controlled by the monetary policymaker (e.g. interest
rates and monetary aggregates containing information on monetary policy
actions) respectively. Matrix $A$ describes the contemporaneous relations among
the variables and $C(L)$ is a matrix finite-order lag polynomial. $\nu = (\nu^{Y\prime}, \nu^{M\prime})'$ is
a vector of structural disturbances to the non-policy and policy variables;
non-zero off-diagonal elements of $B$ allow some shocks to affect directly more than
one endogenous variable in the system. The main difference between the two
approaches lies in the aim for which models are estimated.

Traditional Cowles Commission structural models are designed to identify the 
impact of policy variables on macroeconomic quantities in order to determine the 
value to be assigned to the monetary instruments (M) to achieve a given target 
for the macroeconomic variables (Y), assuming exogeneity of the policy variables 
in M on the ground that these are the instruments controlled by the policymaker. 
Identification in traditional structural models is obtained without assuming the 
orthogonality of structural disturbances. Dynamic multipliers are used to de¬ 
scribe the impact of monetary policy variables on macroeconomic quantities. In 
the computation of dynamic multipliers the responses of macroeconomic vari¬ 
ables to monetary policy can be obtained without decomposing monetary policy 
into its endogenous and exogenous components, and, in fact, in most traditional 
empirical applications such decomposition is not implemented. 

The assumed exogeneity of the monetary variables in the traditional
approach makes the model invalid for policy analysis if monetary policy reacts
endogenously to macroeconomic variables. The LSE methodology would
recognise the problem of the invalid exogeneity assumption for $M$ and would then
proceed to the identification of an alternative enlarged model (presumably such
identification would be obtained through the imposition of a-priori restrictions on
the dynamics of the lagged variables). However, the new model would still be
used for simulation and econometric policy evaluation, whenever the
appropriate concept of exogeneity (respectively strong and super) were satisfied by the
adopted specification.
VAR modelling would reject the Cowles Commission identifying restrictions
as "incredible", for reasons not very different from the ones pinned down by the
LSE approach. However, VAR models of the transmission mechanism are not
estimated to yield advice on the best monetary policy. They are rather
estimated to provide empirical evidence on the response of macroeconomic variables
to monetary policy impulses, in order to discriminate between alternative
theoretical models of the economy. It then becomes crucial to identify monetary
policy actions using restrictions independent from the competing models of the
transmission mechanism under empirical investigation, taking into account the
potential endogeneity of policy instruments.

In a series of recent papers, Christiano, Eichenbaum and Evans [12], [13] apply
the VAR approach to derive "stylized facts" on the effect of a contractionary
policy shock, and conclude that plausible models of the monetary transmission
mechanism should be consistent at least with the following evidence on prices,
output and interest rates: (i) the aggregate price level initially responds very
little; (ii) interest rates initially rise; and (iii) aggregate output initially falls,
with a j-shaped response and a zero long-run effect of the monetary impulse.
Such evidence leads to the dismissal of traditional real business cycle models,
which are not compatible with the liquidity effect of monetary policy on interest
rates, and of the Lucas [35] model of money, in which the effect of monetary
policy on output depends on price misperceptions. The evidence seems to be more
in line with alternative interpretations of the monetary transmission mechanism
based on sticky-price models (Goodfriend and King [26]), limited participation
models (Christiano and Eichenbaum [11]) or models with indeterminacy-sunspot
equilibria (Farmer [21]).

Having stated the objective of VAR models, we are now in a position to
assess how identification, estimation and simulation are implemented to analyse
the monetary transmission mechanism.

VAR models concentrate on shocks. First the relevant shocks are identified;
then the response of the system to shocks is described by analysing impulse
responses (the propagation mechanism of the shocks), forecast error variance
decompositions, and historical decompositions.

6.2 Identification and Estimation 

We have introduced the identification problem for VAR in Chapter 3. Consider the
representation of the general structural model of interest:

$$A \begin{pmatrix} Y_t \\ M_t \end{pmatrix} = C(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix} + B \begin{pmatrix} \nu^{Y}_t \\ \nu^{M}_t \end{pmatrix} \qquad (6.2)$$

The structural model (6.2) is not directly observable; however, a VAR can be
estimated as the reduced form of the underlying structural model:

$$\begin{pmatrix} Y_t \\ M_t \end{pmatrix} = A^{-1} C(L) \begin{pmatrix} Y_{t-1} \\ M_{t-1} \end{pmatrix} + \begin{pmatrix} u^{Y}_t \\ u^{M}_t \end{pmatrix} \qquad (6.3)$$

where $u$ denotes the VAR residual vector, normally independently distributed
with full variance-covariance matrix $\Sigma$. The relation between the VAR residuals
in $u$ and the structural disturbances in $\nu$ is therefore:

$$A \begin{pmatrix} u^{Y}_t \\ u^{M}_t \end{pmatrix} = B \begin{pmatrix} \nu^{Y}_t \\ \nu^{M}_t \end{pmatrix}$$

undoing the partitioning we have

$$u_t = A^{-1} B \nu_t \qquad (6.4)$$

from which we can derive the relation between the variance-covariance matrix
of $u_t$ (observed) and the variance-covariance matrix of $\nu_t$ (not observed) as
follows:

$$E(u_t u_t') = A^{-1} B\, E(\nu_t \nu_t')\, B' A^{-1\prime}$$

Substituting population moments with sample moments we have:

$$\hat{\Sigma} = A^{-1} B\, I\, B' A^{-1\prime} \qquad (6.5)$$

$\hat{\Sigma}$ contains $n(n+1)/2$ different elements: this is the maximum number of
identifiable parameters in matrices $A$ and $B$. Therefore, a necessary condition for
identification is that the maximum number of parameters contained in the two
matrices is $n(n+1)/2$; such a condition makes the number of equations equal to the
number of unknowns in system (6.5). As usual, for such a condition to be also
sufficient for identification it is also needed that no equation in (6.5) is a linear
combination of the other equations in the system¹. As for traditional models, we
have the three possible cases of under-identification, just-identification and
over-identification, and the validity of over-identifying restrictions can be tested
via a statistic distributed as a $\chi^2$ with a number of degrees of
freedom equal to the number of over-identifying restrictions (Amisano-Giannini
[1]). Once identification has been achieved, the estimation problem is solved by
applying Generalised Method of Moments estimation. We shall describe this class
of estimators in the next chapter.
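The counting argument behind the order condition is simple enough to be written
down as a small helper. This is only an illustration of the necessary condition
discussed above: the function name and its return convention are our own, and the
check says nothing about the additional requirement that the equations in (6.5) be
linearly independent.

```python
def identification_status(n, free_params):
    """Classify a structural VAR by the order condition.

    n           -- number of variables in the VAR
    free_params -- number of free parameters in the matrices A and B
    Returns the classification and the number of over-identifying restrictions.
    """
    moments = n * (n + 1) // 2   # distinct elements of the symmetric covariance matrix
    if free_params > moments:
        return "under-identified", 0
    if free_params == moments:
        return "just-identified", 0
    return "over-identified", moments - free_params

# A recursive scheme on six variables: 15 free elements below the diagonal of A
# plus 6 diagonal elements of B give 21 = 6 * 7 / 2 parameters.
print(identification_status(6, 21))
# Two additional zero restrictions leave 19 free parameters.
print(identification_status(6, 19))
```

In the over-identified case the second element of the result is the number of
restrictions that can be tested, i.e. the degrees of freedom of the $\chi^2$ statistic
mentioned in the text.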

In practice, identification requires the imposition of some restrictions on the
parameters of A and B. This step has historically been implemented in a
number of different ways; we concentrate on the most widely used strategies in
the next subsections.

¹ See Amisano-Giannini [1], Hamilton [30].

6.2.1 Choleski Decomposition

In the famous article which introduced VAR methodology to the profession, 
Sims [48] proposed the following identification strategy, based on the Choleski 
decomposition of matrices: 


This is obviously a just-identification scheme, where the identification of
structural shocks depends on the ordering of the variables. It corresponds to a
recursive economic structure, with the most endogenous variable ordered last.
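As an illustration, the recursive scheme can be computed directly from an estimated residual covariance matrix. The sketch below assumes the normalisation in which A is lower triangular with ones on the diagonal and B is diagonal, so that Σ = A⁻¹BB'(A⁻¹)'; the function name and the numerical covariance matrix are hypothetical.

```python
import numpy as np


def choleski_identification(sigma):
    """Return (A, B), with A unit lower triangular and B diagonal, such that
    sigma = A^{-1} B B' (A^{-1})'."""
    P = np.linalg.cholesky(sigma)   # lower triangular, sigma = P P'
    B = np.diag(np.diag(P))         # standard deviations of structural shocks
    A = B @ np.linalg.inv(P)        # A^{-1} B = P  implies  A = B P^{-1}
    return A, B


# Hypothetical covariance matrix of the VAR residuals.
sigma = np.array([[4.0, 2.0],
                  [2.0, 3.0]])
A, B = choleski_identification(sigma)
```

Reordering the variables (permuting rows and columns of `sigma`) generally changes A and B, which is the sense in which the identification depends on the ordering.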

6.2.2 Structural models with contemporaneous restrictions 

In this identification scheme some a-priori information is used to impose
restrictions on the elements of matrices A and B, different from the Choleski
ordering. If the objective of the VAR is to provide evidence to choose between
competing models, the identifying restrictions should be independent of the
theoretical predictions of those models. The recent literature on the monetary
transmission mechanism² offers good examples of how this kind of restriction can
be derived. VARs of the monetary transmission mechanism are specified on six
variables: the vector of macroeconomic non-policy variables includes gross
domestic product (GDP), the consumer price index (P) and the commodity price
level (Pcm), while the vector of policy variables includes the federal funds
rate (FF), the quantity of total bank reserves (TR) and the amount of
nonborrowed reserves (NBR). Given the estimation of the reduced-form VAR for the
six macro and monetary variables, a structural model is identified by: (i)
assuming orthogonality of the structural disturbances; (ii) imposing that
macroeconomic variables do not simultaneously react to monetary variables, while
the simultaneous feedback in the other direction is allowed; and (iii) imposing
restrictions on the monetary block of the model reflecting the operational
procedures implemented by the monetary policy maker. All identifying
restrictions satisfy the criterion of independence from specific theoretical
models. In fact, within the class of models estimated on monthly data,
restrictions (ii) are consistent with a wide spectrum of alternative theoretical
structures and imply a minimal assumption on the lag of the impact of monetary
policy on macroeconomic variables, whereas restrictions (iii) are based on
institutional analysis.

Restrictions (ii) are made operational by setting to zero an appropriate block 
of elements of the A matrix. 

² See Strongin [57], Bernanke-Mihov [5], Christiano, Eichenbaum and Evans [12], Leeper,
Sims and Zha [33] on this point.

The contemporaneous relations among the Fed funds rate and the reserve
aggregates are derived, as in Bernanke and Mihov [5], from a specific model of
the reserve market:

$$ u^{TR} = -\alpha u^{FF} + v^{D} \qquad (6.7) $$

$$ u^{BR} = \beta u^{FF} + v^{B} \qquad (6.8) $$

$$ u^{NBR} = \phi^{D} v^{D} + \phi^{B} v^{B} + v^{S} \qquad (6.9) $$

Equations (6.7) and (6.8) describe banks’ demand equations (expressed in
innovation, i.e. VAR residual, form) for total reserves, TR, and borrowed
reserves, BR (time subscripts are omitted): the federal funds rate affects
negatively the demand for total reserves (6.7) and positively the demand for
borrowed reserves³. $v^D$ and $v^B$ are disturbances to total and borrowed reserves
respectively. The supply of nonborrowed reserves in (6.9) reflects the behaviour
of the Federal Reserve. In particular, by means of open-market operations, the
Fed can change the amount of NBR supplied to the banking system in response to
(readily observed) disturbances to total and borrowed reserve demand. Moreover,
variations in nonborrowed reserves may be due to monetary policy shocks
unrelated to reserve demand behaviour. In (6.9) the coefficients $\phi^D$ and $\phi^B$
measure the reaction of the Fed to total and borrowed reserve demand movements
respectively, and $v^S$ represents the monetary policy shock to be empirically
identified. The market for reserves featuring the assumed simultaneous relations
is described in Figure 6.1.

Combining the market for reserves with the macroeconomic variables, we can
explicitly rewrite (6.4) as follows:

(6.10)


Several features of (6.10) must be noted. First, VAR residuals from the first
three equations, describing the non-policy part of the system, are orthogonalized
simply by assuming a recursive (Choleski) structure for the corresponding block
of the A matrix. This procedure yields orthogonal disturbances to which we do
not attach a specific “structural” interpretation, labelling them simply as
$v_i^{NP}$ (i = 1, 2, 3), where NP denotes a non-policy shock.


³ We assume from the start that movements in the discount rate, which would enter (6.8)
with a negative sign, are completely anticipated, so that the innovation in the Fed
funds-discount rate differential is entirely attributable to the former rate.

Fig. 6.1. The U.S. market for bank reserves


Second, as shown by Bernanke and Mihov [5], the general formulation in
(6.10) is still not identified, but identification can be completed by a careful
analysis of the operational procedures followed by the Central Bank.

• Case I: Federal funds targeting

In this case we have $\phi^D = 1$, $\phi^B = -1$. The Central Bank uses NBR to neutralize
shocks coming from the behaviour of banks and households. We then have for the
monetary block identification:

The model is now over-identified: Choleski plus additional restrictions.

• Case II: targeting NBR

NBR is now informative on monetary policy shocks.

• Case III: Strongin identification (1994)

Shocks to reserves are demand shocks which the Central Bank has to
accommodate. Therefore monetary policy shocks are the shocks to NBR orthogonal
to shocks to TR. Moreover, the Central Bank does not react to borrowed reserves:
$\alpha = 0$, $\phi^B = 0$.

NBR is now informative.

• Case IV: controlling Borrowed Reserves

In this case TR-NBR is a function only of the shocks $v^S$: $\phi^D = 1$,
$\phi^B = \alpha/\beta$. In this case we have:

It is easily seen that alternative regimes would imply identification schemes
which are technically not far from Choleski’s triangularization, with different
orderings of the monetary variables. In a Fed funds targeting regime FF does not
react contemporaneously to the other monetary variables, while in a nonborrowed
reserves targeting regime it is NBR that does not react contemporaneously to
the other two monetary shocks. Moreover, information on the operating procedure
of the Fed is important in determining the appropriate identification scheme
and, more importantly, VAR models of the MTM should be estimated within a single
policy regime. Bagliano and Favero [2] provide evidence on the structural
instability of VARs of the MTM estimated across different monetary policy
regimes.
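The regime-specific restrictions can be checked numerically. The sketch below solves the reserve-market model (6.7)-(6.9) for the innovations as functions of the three shocks, under the additional assumption (implicit in the use of TR-NBR above) that total reserves are the sum of borrowed and nonborrowed reserves. The function name and the parameter values are hypothetical.

```python
import numpy as np


def reserve_market_map(alpha, beta, phi_d, phi_b):
    """Loadings of each reserve-market innovation on the shocks (v_D, v_B, v_S).

    Assumes the identity u_TR = u_BR + u_NBR, which is not stated as an
    equation in the text.
    """
    # Equate total reserve demand with borrowed plus nonborrowed reserves
    # and solve for the federal funds rate innovation.
    ff = np.array([1.0 - phi_d, -(1.0 + phi_b), -1.0]) / (alpha + beta)
    tr = -alpha * ff + np.array([1.0, 0.0, 0.0])   # demand for total reserves
    br = beta * ff + np.array([0.0, 1.0, 0.0])     # demand for borrowed reserves
    nbr = np.array([phi_d, phi_b, 1.0])            # supply of nonborrowed reserves
    return {"TR": tr, "BR": br, "NBR": nbr, "FF": ff}


alpha, beta = 0.5, 2.0  # hypothetical slopes of the two demand schedules
ff_targeting = reserve_market_map(alpha, beta, phi_d=1.0, phi_b=-1.0)
br_targeting = reserve_market_map(alpha, beta, phi_d=1.0, phi_b=alpha / beta)
strongin = reserve_market_map(0.0, beta, phi_d=0.8, phi_b=0.0)
```

Under federal funds targeting the FF innovation loads only on the policy shock, under borrowed reserves targeting the same is true of BR, and in the Strongin case the TR innovation coincides with the demand shock, so that the policy shock is the part of NBR orthogonal to TR.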


6.2.3 Structural model with long-run restrictions 

Often the long-run behaviour of shocks provides restrictions acceptable within a
wide range of theoretical models. A typical restriction compatible with virtually
all macroeconomic models is that in the long run demand shocks have zero impact
on output. Blanchard-Quah [8] show how these restrictions can be used to
identify VARs.

The structural model of interest is specified by setting A equal to the identity
matrix and by not imposing any zero restriction on the B matrix. We then have
for a generic vector of variables $y_t$ the following specification:

$$ y_t = \sum_{i=1}^{p} A_i y_{t-i} + B v_t $$

from which it is possible to derive the matrix which describes the long-run
effect of the structural shocks on the variables of interest as follows:

$$ \left( I - \sum_{i=1}^{p} A_i \right)^{-1} B v_t = -\Pi^{-1} B v_t $$

Coefficients in Π are obtained from the reduced form; therefore we are able
to impose long-run restrictions given the estimation of the reduced form.

Two points are worth noting:

• $(I - \sum_{i=1}^{p} A_i)$ is $-\Pi$; for this matrix to be invertible the VAR
must be specified on stationary variables;

• the long-run restrictions are restrictions on the cumulative impulse
response function.

Let us now consider the Blanchard-Quah [8] data-set. The authors aim at
separating demand shocks from supply shocks; they consider a VAR on two
variables, the unemployment rate, UN, and the quarterly rate of growth of GDP,
ΔLY. The original sample contains quarterly data from 1951:2 to 1987:4; we
have retrieved the two series from Datastream, and they are available only for
the sample 1951:2-1987:4. The series are available in the file BQ.WKS. The VAR
is specified with 8 lags, a constant and a deterministic trend (in the original
paper a break in the constant for ΔLY is also allowed, but we do not allow it)
as follows.


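$$ \begin{pmatrix} \Delta LY_t \\ UN_t \end{pmatrix} = A_1 \begin{pmatrix} \Delta LY_{t-1} \\ UN_{t-1} \end{pmatrix} + \dots + A_8 \begin{pmatrix} \Delta LY_{t-8} \\ UN_{t-8} \end{pmatrix} + A_9 \begin{pmatrix} 1 \\ TREND \end{pmatrix} + \begin{pmatrix} u_{1t} \\ u_{2t} \end{pmatrix} $$

The structure of interest is the following:

$$ \begin{pmatrix} \Delta LY_t \\ UN_t \end{pmatrix} = A_1 \begin{pmatrix} \Delta LY_{t-1} \\ UN_{t-1} \end{pmatrix} + \dots + A_8 \begin{pmatrix} \Delta LY_{t-8} \\ UN_{t-8} \end{pmatrix} + A_9 \begin{pmatrix} 1 \\ TREND \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} $$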

To obtain the identifying restrictions consider that the long-run effect of the
structural shocks is

$$ -\Pi^{-1} B v_t = \begin{pmatrix} k_{11} & k_{12} \\ k_{21} & k_{22} \end{pmatrix} \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} $$

demand shocks are identified by imposing that their long-run impact on the level
of output is zero:

$$ k_{11} b_{11} + k_{12} b_{21} = 0 $$

Note that by imposing the restriction that the cumulative impulse response of
the rate of output growth to a demand shock is zero we impose the restriction
that the impulse response of the level of output to a demand shock is zero in
the long run. As the variables are stationary, the long-run response of ΔLY and
UN to all shocks is zero by definition.

We implement the procedure on the data by using MALCOLM [41], a package 
written for RATS. 

We start from the estimation of the VAR; we then implement the Johansen
procedure on this VAR and make sure that the null of stationarity is not
rejected. We then retrieve the Π matrix,

$$ \Pi = \begin{pmatrix} 0.1451 & 0.2168 \\ -0.5741 & -0.0693 \end{pmatrix}, $$

then we specify:

$$ -\Pi^{-1} = \begin{pmatrix} -0.1451 & -0.2168 \\ 0.5741 & 0.0693 \end{pmatrix}^{-1} = \begin{pmatrix} 0.60572 & 1.8949 \\ -5.0179 & -1.2683 \end{pmatrix} $$

and our long-run identifying restriction is

$$ 0.60572 \, b_{11} + 1.8949 \, b_{21} = 0 $$

Note the difference between this methodology and the Choleski decomposition,
which would simply restrict $b_{21}$ to zero.
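The numbers above can be reproduced, and the restriction turned into an estimate of B, with a short numerical sketch. The residual covariance matrix used here is hypothetical (it is not reported in the text), and the factorisation used to solve for B is one convenient route rather than the procedure implemented in MALCOLM.

```python
import numpy as np

# Minus the Pi matrix, as reported in the text (four decimals).
minus_pi = np.array([[-0.1451, -0.2168],
                     [0.5741, 0.0693]])
K = np.linalg.inv(minus_pi)  # long-run multiplier of the structural shocks

# Hypothetical covariance matrix of the VAR residuals (NOT from the text).
sigma = np.array([[1.0, -0.3],
                  [-0.3, 0.5]])


def long_run_identify(K, sigma):
    """Return B such that B B' = sigma and the first shock has zero long-run
    effect on the first variable, i.e. (K B)[0, 0] = 0."""
    # (K B)(K B)' must equal K sigma K'; start from its Choleski factor.
    L = np.linalg.cholesky(K @ sigma @ K.T)
    # Swapping the two columns keeps the product unchanged and moves the
    # zero of the lower triangular factor to position (1, 1).
    KB = L[:, ::-1]
    return np.linalg.solve(K, KB)


B = long_run_identify(K, sigma)
restriction = K[0, 0] * B[0, 0] + K[0, 1] * B[1, 0]  # should be numerically zero
```

The signs of the columns of B are not pinned down by the restriction; a sign normalisation (for instance, a positive output response to a supply shock) would have to be imposed separately.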

6.2.4 Identification in cointegrated VARs 

Let us now consider how the identification problem changes when we have a
cointegrated VAR. Considering, for simplicity, only first-order dynamics, the
cointegrated reduced form is:

$$ \Delta y_t = \Pi y_{t-1} + u_t $$

where $\Pi = \alpha\beta'$. As we know, identification of the cointegrating vectors
is a problem entirely separate from identification of the structural shocks of
interest. Therefore, having solved the identification of the cointegrating
relationships, we still have to deal with the problem of imposing appropriate
restrictions on the parameters of the B matrix in order to pin down the shocks
$v_t$:

$$ \Delta y_t = \Pi y_{t-1} + B v_t. $$

In the context of cointegration, the identification problem can be solved
in a very natural way. Consider, for simplicity, the case of a bivariate model
$\mathbf{y}_t = (y_t, x_t)'$, in which the variables are non-stationary, I(1), but
cointegrated with cointegrating vector $(1, -1)$, so the rank of the Π matrix is 1
and we use the following representation of the stationary reduced form:

(6.11)

Model (6.11) can be re-written as follows [40]:

$$ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1-L & 0 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} y_t - x_t \\ \Delta x_t \end{pmatrix} = \begin{pmatrix} \alpha_{11} & 0 \\ \alpha_{21} & 0 \end{pmatrix} \begin{pmatrix} y_{t-1} - x_{t-1} \\ \Delta x_{t-1} \end{pmatrix} + \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} \qquad (6.12) $$

The two representations are absolutely identical (same residuals). The second
representation has been widely used in research based on present value models.
The cointegrating properties of the system suggest the presence of two types
of shocks: a permanent one (to be related to the single common trend shared
by the two variables) and a transitory one (to be related to the cointegrating
relation). It seems therefore natural to identify one shock as permanent and the
other as transitory. Given that we have a stationary system, the identification
of shocks is obtained by deriving long-run responses of the variables of interest
to relevant shocks. From (6.12) we have:

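$$ \left[ \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1-L & 0 \\ 0 & 1 \end{pmatrix} - \begin{pmatrix} \alpha_{11} L & 0 \\ \alpha_{21} L & 0 \end{pmatrix} \right] \begin{pmatrix} y_t - x_t \\ \Delta x_t \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} \qquad (6.13) $$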

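from which long-run responses are obtained by setting L = 1 and by inverting
the matrix pre-multiplying the variables in the stationary representation of the
VAR:

$$ \begin{pmatrix} -\alpha_{11} & 1 \\ -\alpha_{21} & 1 \end{pmatrix} \begin{pmatrix} y_t - x_t \\ \Delta x_t \end{pmatrix} = \begin{pmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} \qquad (6.14) $$

$$ \begin{pmatrix} y_t - x_t \\ \Delta x_t \end{pmatrix} = \begin{pmatrix} \dfrac{-b_{11} + b_{21}}{\alpha_{11} - \alpha_{21}} & \dfrac{-b_{12} + b_{22}}{\alpha_{11} - \alpha_{21}} \\ \dfrac{-\alpha_{21} b_{11} + \alpha_{11} b_{21}}{\alpha_{11} - \alpha_{21}} & \dfrac{-\alpha_{21} b_{12} + \alpha_{11} b_{22}}{\alpha_{11} - \alpha_{21}} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} \qquad (6.15) $$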


so $v_{2t}$ can be identified as the transitory shock by imposing the following
restriction:

$$ -\alpha_{21} b_{12} + \alpha_{11} b_{22} = 0 $$

which, given knowledge of the α parameters from cointegration analysis,
provides the just-identifying restriction for the parameters in B. Interestingly,
there is one case in which this identification is equivalent to the Choleski
ordering: the case in which $\alpha_{11} = 0$. Note that this is the case in which
$\Delta y_t$ is weakly exogenous for the estimation of $b_{21}$. An application of this
identifying scheme is provided in Favero, Giavazzi, Spaventa [24], where the
procedure is implemented to separate international from local factors in the
determination of interest rate fluctuations.
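A small numerical sketch can make the equivalence with the Choleski ordering concrete. It takes the loadings α₁₁ and α₂₁ and a residual covariance matrix as given (all values below are hypothetical), imposes a zero long-run response of Δx to the second shock, and solves for B. The matrix M is the one obtained by setting L = 1 in (6.13); the function name is an assumption of this example.

```python
import numpy as np


def permanent_transitory_B(alpha11, alpha21, sigma):
    """Return B with B B' = sigma and -alpha21*b12 + alpha11*b22 = 0."""
    # Matrix pre-multiplying (y - x, dx)' once L = 1 is imposed.
    M = np.array([[-alpha11, 1.0],
                  [-alpha21, 1.0]])
    C1 = np.linalg.inv(M)            # long-run multiplier of B v
    S = C1 @ sigma @ C1.T            # must equal (C1 B)(C1 B)'
    J = np.array([[0.0, 1.0],
                  [1.0, 0.0]])
    # Choleski factor of the reordered system; swapping rows back puts the
    # zero in position (2, 2) of C1 B.
    L = np.linalg.cholesky(J @ S @ J)
    return M @ (J @ L)


sigma = np.array([[1.0, 0.4],
                  [0.4, 2.0]])       # hypothetical residual covariance
B_general = permanent_transitory_B(-0.2, 0.3, sigma)
B_special = permanent_transitory_B(0.0, 0.5, sigma)   # alpha11 = 0
```

With α₁₁ = 0 the resulting B is lower triangular and coincides, up to the sign of its columns, with the Choleski factor of the covariance matrix.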

6.3 Why shocks?

Having identified the “monetary rule” by proposing an explicit solution to the
problem of the endogeneity of money, the VAR approach concentrates on
deviations from the rule. Deviations from the rule can be obtained either by
changing the systematic component of monetary policy or by considering exogenous
shocks, which leave systematic monetary policy unaltered. In the former case
the deviation from the rule is obtained by changing some parameters in the A
matrix describing the simultaneous relations among variables, while in the latter
case the parameters in the matrices A and B are not altered. Consider for
example the case of federal funds targeting. The first type of deviation is
obtained by modifying the response of the federal funds rate to macroeconomic
conditions, i.e. fluctuations in output, commodity prices and the consumer price
index, while the second type of deviation is obtained by considering an
exogenous shock which does not alter the response of the monetary policy maker
to macroeconomic conditions.

VAR modellers have exclusively concentrated on simulating shocks, leaving
the systematic component of monetary policy unaltered.

The VAR approach to the monetary transmission mechanism has been criticised
on the basis that it views central banks as “random number generators”.
This does not seem to be correct: in fact, monetary policy rules are explicitly
estimated in structural VAR models. However, the focus is not on rules but on
deviations from rules, since only when central banks deviate from their rules
does it become possible to collect interesting information on the response of
macroeconomic variables to monetary policy impulses, to be compared with the
predictions of the alternative theoretical models. In fact, deviations from
monetary policy rules provide researchers with the best opportunity to detect
the response of macroeconomic variables to monetary impulses that are not
expected by the market. The first link in most models of the monetary
transmission mechanism connects the policy rates to the term structure of
interest rates, and the most popular model of the term structure, the
expectational model, predicts that the term structure does not generally react
to expected monetary impulses. The monetary impulses relevant to the
transmission analysis are therefore structural shocks in (6.2).

Recently, McCallum [38] has criticised the choice of VAR modellers of
concentrating on shocks which leave the systematic component of monetary policy
unaltered. It is argued that the emphasis on the shock component is misplaced
because the unsystematic portion of policy-instrument variability is very small
in relation to the variability of the systematic component. Indeed, it is
conceivable that the policy behaviour could be virtually devoid of any
unsystematic component. In the limit, that is, the variance of the shock
component could approach zero. But this would not imply that monetary policy is
unimportant for price level behaviour, central bank’s main responsibility....”
([38], p. 5).

The simulation of systematic monetary policy requires, for robustness to the
Lucas critique, the specification of a forward-looking model in which “deep
parameters” are identified independently from nuisance parameters describing
expectations formation and dependent on the policy regime. This is what
McCallum effectively does in a series of papers [38], [39], where the impact of
modifications in the monetary policy maker’s reaction function is dynamically
simulated.

However, it is important to note that McCallum’s work is not aimed at model
selection but rather at model simulation. The question of using the empirical
evidence to judge between different theoretical models is not addressed in
McCallum’s work, which is based on a specific model.

If VAR models are instead used to describe the empirical evidence relevant
to the choice between alternative theoretical models, then there is a possible
defence of the choice of concentrating on shocks rather than on the systematic
components of monetary policy. Such a defence is related to the Lucas critique.

Consider the following Data Generating Process:

$$ y_t = a_1 m_{t+1} + a_2 y_{t-1} + u_{1t} $$

$$ m_t = b_0 + b_1 y_{t-1} + b_2 m_{t-1} + u_{2t} $$

where y is the macroeconomic variable and m is the monetary policy variable.
The DGP is the relevant theoretical model, which is unknown to the empirical
researcher. The empirical researcher tries instead to describe the empirical
relation between the monetary instruments and the macroeconomic variables by
specifying the following structural VAR:

$$ y_t = c_0 + c_1 m_t + c_2 y_{t-1} + v_{1t} \qquad (6.16) $$

$$ m_t = b_0 + b_1 y_{t-1} + b_2 m_{t-1} + v_{2t} $$

where the following restrictions hold: $c_0 = a_1 b_0$, $c_1 = a_1 b_2$, $c_2 = a_2 + a_1 b_1$.

Model (6.16) is not viable for the econometric evaluation of systematic monetary
policy, in that the parameters in the equation for y cannot be kept constant
when the systematic component of the monetary policy rule is altered. However,
the simulation of the dynamic impact of a monetary policy shock, identified à la
Choleski by ordering m first, is still viable in that it is performed while
keeping all parameters constant.

Note that this small example reiterates the importance of estimating
parameters in structural VAR models by concentrating on a single policy regime;
in fact, regime shifts require different parameterizations.

6.4 Description of VAR models

After the identification of structural shocks of interest, the properties of VAR 
models are described using impulse response analysis, variance decomposition 
and historical decomposition. 

Consider a structural VAR model for a generic vector $y_t$, containing m
variables:

$$ A_0 y_t = \sum_{i=1}^{p} A_i y_{t-i} + B v_t $$

which we can rewrite as:

$$ [A_0 - A(L)] \, y_t = B v_t, \qquad A(L) = \sum_{i=1}^{p} A_i L^i $$

Now, by inverting $[A_0 - A(L)]$ (under the assumption of invertibility of this
polynomial) we obtain the moving average representation for our VAR process:

$$ y_t = C(L) v_t \qquad (6.17) $$

$$ y_t = C_0 v_t + C_1 v_{t-1} + \dots + C_s v_{t-s} + \dots $$

$$ C(L) = [A_0 - A(L)]^{-1} B $$

$$ C_0 = A_0^{-1} B $$

To illustrate the concept of an impulse response function, we interpret the
generic matrix $C_s$ within the moving average representation as follows:

$$ C_s = \frac{\partial y_{t+s}}{\partial v_t} $$

In other words, the generic element {i, j} of matrix $C_s$ represents the impact of
a shock hitting the j-th variable of the system at time t on the i-th variable of
the system at time t + s. As s varies we have a function describing the response
of variable i to an impulse in variable j. For this function of partial
derivatives to be meaningful we must allow a shock in variable j to occur while
all other shocks are kept at zero. Of course this is allowed for structural
shocks, as they are identified by imposing that they are orthogonal to each
other. Note however that the concept of an impulse response function is not
applicable to reduced-form VAR innovations, which, in general, are correlated
with each other.

Historical decomposition is obtained by using the structural MA representation
to separate series into the components (orthogonal to each other) attributable
to the different structural shocks.

Finally, the Forecasting Error Variance Decomposition (FEVD) is obtained
from (6.17) by deriving the error in forecasting y s periods in the future as:

$$ y_{t+s} - E_t y_{t+s} = C_0 v_{t+s} + C_1 v_{t+s-1} + \dots + C_{s-1} v_{t+1} $$

from which we can construct the variance of such forecasting error as:

$$ Var(y_{t+s} - E_t y_{t+s}) = C_0 I C_0' + C_1 I C_1' + \dots + C_{s-1} I C_{s-1}' $$

from which we can compute the share of the total variance attributable to
the variance of each structural shock. Note again that such a decomposition
makes sense only if shocks are orthogonal to each other. In fact it is only in
this case that we can write the variance of the total forecasting error as a sum
of variances of the single shocks (as the covariance terms are zero following
the orthogonality property of structural shocks).

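To illustrate the three concepts consider the following bivariate VAR, in which
structural parameters have been identified and estimated via a Choleski
decomposition:

$$ \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} y_{1t-1} \\ y_{2t-1} \end{pmatrix} + \begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} $$

the MA representation is

$$ \begin{pmatrix} y_{1t} \\ y_{2t} \end{pmatrix} = \begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t} \\ v_{2t} \end{pmatrix} + \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix} \begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t-1} \\ v_{2t-1} \end{pmatrix} + \dots + \begin{pmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{pmatrix}^{s} \begin{pmatrix} b_{11} & 0 \\ b_{21} & b_{22} \end{pmatrix} \begin{pmatrix} v_{1t-s} \\ v_{2t-s} \end{pmatrix} + \dots $$

from which impulse response functions, historical decomposition and
Forecasting Error Variance Decomposition are immediately obtained.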
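For a first-order bivariate VAR of this kind, the impulse responses and the variance decomposition follow from repeated multiplication by the autoregressive matrix. The sketch below is a minimal illustration; the parameter values and function names are hypothetical, and structural shocks are assumed orthogonal with unit variance.

```python
import numpy as np


def impulse_responses(A1, B, horizon):
    """Return C with C[s] = A1^s B for s = 0..horizon, the MA coefficients of
    y_t = A1 y_{t-1} + B v_t."""
    C = [np.asarray(B, dtype=float)]
    for _ in range(horizon):
        C.append(A1 @ C[-1])
    return np.array(C)


def fevd(C):
    """Entry [h, i, j] is the share of the (h+1)-step-ahead forecast error
    variance of variable i attributable to shock j."""
    # With orthogonal unit-variance shocks, variances add up across shocks.
    contributions = np.cumsum(C ** 2, axis=0)
    return contributions / contributions.sum(axis=2, keepdims=True)


# Hypothetical parameter values, chosen only for illustration.
A1 = np.array([[0.5, 0.1],
               [0.2, 0.3]])
B = np.array([[1.0, 0.0],
              [0.5, 2.0]])
C = impulse_responses(A1, B, horizon=8)
shares = fevd(C)
```

Because B is lower triangular, the second shock has no impact on the first variable on impact, so at the shortest horizon the forecast error variance of the first variable is entirely due to the first shock.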


6.5 Monetary policy in closed economies 

Cumulative work on the analysis of the monetary transmission mechanism in
the U.S. led to the specification of a VAR system which has by now become the
standard reference model. We have already seen and discussed this benchmark
specification, which contains six variables: gross domestic product (GDP), the
consumer price index (P) and the commodity price level (Pcm), together with
the federal funds rate (FF), the quantity of total bank reserves (TR) and the
amount of nonborrowed reserves (NBR).

It is interesting to see how the specification of the benchmark model has
developed over time.

Initially models were estimated on a rather limited set of variables, i.e.
prices, money and output, and identified by imposing a diagonal form on the
matrix B and a lower triangular form on the matrix A, with money coming last in
the ordering of the variables included in the VAR (Choleski identification).
This first type of model is discussed in Leeper, Sims and Zha [33]; we replicate
their results on the data-set LSZUSA.WF1. The underlying structural model is
specified as follows:

$$ A_0 y_t = \sum_{i=1}^{k} A_i y_{t-i} + B v_t \qquad (6.18) $$

$$ y_t = \begin{pmatrix} P_t \\ Y_t \\ M_t \end{pmatrix}, \quad A_0 = \begin{pmatrix} 1 & 0 & 0 \\ a_{21} & 1 & 0 \\ a_{31} & a_{32} & 1 \end{pmatrix}, \quad B = \begin{pmatrix} b_{11} & 0 & 0 \\ 0 & b_{22} & 0 \\ 0 & 0 & b_{33} \end{pmatrix}, \quad v_t \sim N.I.D.(0, I). $$


All variables are expressed in logs. Identification is Choleski with money
ordered last. This is a model geared to deliver monetary policy shocks, so the
identification of shocks to LP and LY does not matter. As in [33], we have
estimated the model on the sample 1960:1-1996:3, including six lags of each
variable. The impulse responses obtained are reported in Figure 6.2.

Prices react slowly to monetary policy, while output responds in the short run;
in the long run (from two years after the shock onwards) prices start adjusting
and the significant effect on output vanishes. There is no strong evidence for
the endogeneity of money. This is easily checked by looking at the estimated
parameters in $A_0$ and by analysing the FEVD in Figure 6.3.

Macroeconomic variables play a very limited role in explaining the variance 
of the forecasting error of money, while money plays instead an important role 
in explaining fluctuations of both the macroeconomic variables. 

Sims [48] extended the VAR to include the interest rate on Federal Funds,
ordered just before money as the penultimate variable in the Choleski
identification. The idea is to see the robustness of the above results after
identifying the part of money which is endogenous to the interest rate. Impulse
response functions are modified as shown in Figure 6.4, while the FEVD is
modified as shown in Figure 6.5.

Impulse response functions and FEVD raise a number of issues:

• though little of the variation in money is predictable from past output and
prices, a considerable amount becomes predictable when past short-term
interest rates are included in the information set;

• it is difficult to interpret the behaviour of money as driven by money supply
shocks. The response to money innovations gives rise to the “liquidity
puzzle”: the interest rate declines very slightly contemporaneously in response
to a money shock and starts increasing afterwards;

[Figure: responses to one S.D. innovations ± 2 S.E.; panels show the response of
each of LP, LY and LM2 to shocks in LP, LY and LM2]

Fig. 6.2. Impulse Response functions in a three-variables VAR of the MTM


• there are difficulties also with interpreting shocks to interest rates as
monetary policy shocks. The response of prices to an innovation in interest
rates gives rise to the “price puzzle”: prices increase significantly after an
interest rate hike.

An accepted interpretation of the liquidity puzzle relies on the argument that
the money stock is dominated by demand rather than by supply shocks. Moreover,
the interpretation of money as driven by demand shocks is consistent with the
impulse response of money to interest rates. Note also that, even if the money
stock were to be dominated by supply shocks, it would be reflecting both the
behaviour of central banks and of the banking system. For both these reasons
the broad monetary aggregate has been substituted by narrower aggregates, bank
reserves, on which it is easier to identify shocks mainly driven by the
behaviour of the monetary policy maker.

The “price puzzle” has been attributed to misspecification of the four-variable
VAR used by Sims. Suppose that there exists a leading indicator for inflation to
which the Fed reacts. If such a leading indicator is omitted from the VAR, we
then have an omitted variable positively correlated with inflation and interest
rates, which makes the VAR misspecified and explains the positive relation
between prices and interest rates observed in the impulse response functions.
It has been observed⁴ that the inclusion of a Commodity Price Index in the VAR
solves the “price puzzle”.

Our brief historical record of the empirical analysis of closed-economy VARs of
the MTM has brought us to the justification of the six variables included in
what is by now known as the benchmark VAR model. We have already discussed its
identification; let us now examine the impulse response functions derived by
using the Fed funds targeting identifying restrictions and reported in
Figure 6.6.

[Figure: variance decomposition; panels show the percentage of LP variance due
to LP, LY and LM2]

Fig. 6.3. FEVD in a three-variables VAR of the MTM

The evidence reported in the IRF represents the relevant facts to be included
in theoretical models of the MTM. It is this kind of evidence that has
established that plausible models of the monetary transmission mechanism should
be consistent at least with the following evidence on prices, output and
interest rates: (i) the aggregate price level initially responds very little;
(ii) interest rates initially rise; and (iii) aggregate output initially falls,
with a y-shaped response, with a zero long-run effect of the monetary impulse.

⁴ See Christiano, Eichenbaum and Evans [12], Sims [53].

[Figure: responses to one S.D. innovations ± 2 S.E.; panels show the response of
each of LP, LY, TBILL3 and LM2 to shocks in LP, LY, TBILL3 and LM2]

Fig. 6.4. Impulse responses in a four-variables VAR of the MTM

6.6 Monetary policy in open economies 

Various papers have examined the effects of monetary shocks in open economies,
but this strand of literature has been distinctly less successful in providing
accepted empirical evidence than the VAR approach in closed economies.

The first results have been provided by Eichenbaum and Evans [20], using an
open-economy VAR with the following structure:

$$ A_0 y_t = \sum_{i=1}^{k} A_i y_{t-i} + B v_t \qquad (6.19) $$

where $y_t = [\, Y_t^{US} \;\; P_t^{US} \;\; NBRX_t^{US} \, (FF_t) \;\; Y_t^{FOR} \;\; P_t^{FOR} \;\; R_t^{FOR} \;\; e_t \, (q_t) \,]'$.

$Y^{US}$ and $P^{US}$ are the logs of US output and prices; $NBRX^{US}$ is the
ratio of non-borrowed to total reserves (the appropriate variable from which to
extract monetary policy shocks under a regime of non-borrowed reserves
targeting). $FF$ is the Federal Funds rate, which is considered as an
alternative to $NBRX^{US}$ and is the informative variable for the extraction of
monetary policy shocks under a regime of interest rate targeting; $Y^{FOR}$,
$P^{FOR}$, and $R^{FOR}$ are respectively the logs of output and prices and the
level of the short-term interest rate in the foreign country; $e$ is the nominal
bilateral exchange rate, while $q$ is the real bilateral exchange rate. The
matrix $B$ is diagonal and $A_0$ is lower-triangular. The empirical analysis is
implemented by considering in turn as a foreign country each of the G7 countries
on a sample of monthly data from 1974:1 to 1990:5. The following evidence
emerges: (i) a restrictive US monetary policy shock generates a significant and
persistent appreciation of the US dollar; (ii) a restrictive US monetary policy
shock generates a significant and persistently larger effect on the domestic
interest rate than on the foreign rate; (i) and (ii) imply a sharp deviation
from the uncovered interest parity condition in favour of US dollar-denominated
investments (the “forward-discount puzzle”); (iii) identified US monetary policy
shocks are not different from the shocks derived within closed-economy VARs;
(iv) the closed-economy response of US prices and output to monetary policy
shocks is robust to the extension of the VAR to the open economy; (v) a
restrictive foreign monetary policy shock generates an appreciation of the US
dollar (the “exchange-rate puzzle”); and (vi) the response of the real exchange
rate to the US and foreign monetary policy shocks does not differ significantly
from the response of the nominal exchange rate. Such evidence is substantially
confirmed by the work of Schlagenhauf and Wrase [47], who consider a very
similar specification for the G5 countries over the sample 1972:2-1990:2, using
quarterly data.

Fig. 6.5. FEVD in a four-variable VAR of the MTM (panels: percent of LP and
TBILL3 variance due to LP, LY, TBILL3, and LM2)

Fig. 6.6. Impulse response functions in a six variables VAR of the MTM (panels:
responses of LPCM, LY, LP, and SMNBR to one S.D. USFF innovations, ± 2 S.E.)
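Findings (i) and (ii) can be read through the uncovered interest parity (UIP)
condition. The following is a minimal sketch, assuming that e is the log of the
foreign-currency price of one dollar (so that a rise in e is a dollar
appreciation) and that interest rates are expressed per period in decimal form.
The function name and the numbers are hypothetical and are not estimates from
the papers cited.

```python
def excess_return_on_dollar(r_us, r_for, e_now, e_next):
    """Ex-post excess return from holding dollar assets rather than foreign ones.

    e_now and e_next are logs of the foreign-currency price of one dollar
    (an assumption of this sketch), so e_next - e_now > 0 is a dollar
    appreciation. Under UIP the expected value of this quantity is zero.
    """
    return (r_us - r_for) + (e_next - e_now)


# Hypothetical numbers: after a US tightening the interest differential is
# 0.5 percentage points and the dollar appreciates by 1 per cent.
psi = excess_return_on_dollar(r_us=0.060, r_for=0.055, e_now=0.500, e_next=0.510)
print(round(psi, 4))  # positive: dollar investments earn an excess return
```

A positive interest differential combined with a persistent appreciation gives
a positive excess return, which is the sense in which the evidence favours
dollar-denominated investments.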

Some considerations are in order to help the interpretation of the above 
results. 

First, the empirical models are estimated over samples including shifts in 
US and foreign monetary policy regimes: therefore, parameter instability is a 
potential problem. 

Second, the extension to the open economy features the omission from the
VAR of the commodity price index and of the monetary variables not relevant
to the extraction of the policy shocks. While the simplification of the monetary
block is defensible in the light of the absence of contemporaneous feedback
between the informative variables and the other monetary variables under the
chosen identification schemes, the omission of the commodity price index is not
justifiable, as it leads to the same misspecification as in the closed-economy
model for US monetary policy shocks. Moreover, such an omission might well also
bias the identification of the foreign monetary policy shocks if the commodity
price index is regarded as a leading indicator of inflation by the foreign
policymaker. Therefore, it could be argued that the observed puzzles might
depend on the incorrect specification of the VAR generated by the omission of
the commodity price index.

Third, consider the identification scheme. While some rationale can be provided
for a quasi-recursive scheme in closed economies, a similar justification is much
harder to provide in open economies. In fact, the recursive identification scheme
with the exchange rate ordered last implies that neither the US nor the foreign
monetary authority reacts contemporaneously to exchange rate fluctuations. This
assumption seems to be tenable for the US (the Fed's benign neglect of the
dollar), but it is highly questionable for the foreign countries, as they are
much more open economies than the US. The failure of the recursive
identification scheme could also be responsible for the observed puzzles. In
fact, most of the recent empirical work is aimed at breaking such recursive
structure in the identification scheme.
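The implication of ordering the exchange rate last can be made concrete. With a
lower-triangular $A_0$ and a diagonal $B$, the impact matrix of the structural
shocks is lower-triangular and can be recovered as a Choleski factor of the
covariance matrix of the reduced-form innovations. The sketch below uses a
hypothetical three-variable covariance matrix; the ordering and the numbers are
assumptions made only for illustration.

```python
import math


def cholesky(sigma):
    """Lower-triangular P with P P' = sigma (sigma symmetric positive definite)."""
    n = len(sigma)
    p = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(p[i][k] * p[j][k] for k in range(j))
            if i == j:
                p[i][j] = math.sqrt(sigma[i][i] - s)
            else:
                p[i][j] = (sigma[i][j] - s) / p[j][j]
    return p


# Hypothetical covariance matrix of reduced-form innovations, ordered as
# (US policy rate, foreign policy rate, exchange rate).
sigma = [
    [1.00, 0.30, 0.40],
    [0.30, 1.00, 0.20],
    [0.40, 0.20, 2.00],
]
impact = cholesky(sigma)  # impact[i][j]: response of variable i to shock j at time t

for row in impact:
    print([round(x, 3) for x in row])
```

The last column of the impact matrix is zero in the first two rows: by
construction neither policy rate responds within the period to the exchange
rate shock, which is exactly the restriction questioned in the text.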

Kim and Roubini [32] achieve this aim by introducing a structural identification
based on the explicit consideration of money demand and money supply functions.
Their specification can be described as follows for the generic non-US country:

with $R^{FOR}$ denoting the short-term non-US policy rate, $M^{FOR}$ a monetary
aggregate (M0 or M1), $P^{FOR}$ the log of the consumer price index, $Y^{FOR}$
the log of industrial production, $OPW$ the world index of the oil price in
dollars, $FF$ the Federal Funds rate, and $e$ the nominal exchange rate against
the dollar. $B$ is a diagonal matrix. The model is estimated over the sample
1974:7-1992:5, on monthly data. The main differences between the proposed
structural identification and the recursive identification scheme can be
understood by analysing the $A_0$ matrix. Some elements under the principal
diagonal are set to zero to allow the introduction of simultaneous feedbacks
between demand and supply for money and between central bank behaviour and
exchange rates. The estimated model is over-identified in that 23 parameters are
estimated in the $A_0$ and $B$ matrices, out of a possible maximum of 28. The
over-identifying restrictions are tested and not rejected. The identifying
restrictions are rather standard: the US economy is taken as exogenous and the
exchange rate does not enter the Fed reaction function, US output and prices are
not included in the VAR, while a simultaneous feedback is allowed between money
demand and supply (the central bank rule). According to this rule,
contemporaneous US interest rate movements are relevant to the foreign central
bank only if they affect the exchange rate. Only the exchange rate is allowed to
react contemporaneously to news in all the other variables.
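The count behind the over-identification statement can be checked directly: the
covariance matrix of the innovations of an n-variable VAR has n(n+1)/2 distinct
elements, which bounds the number of free parameters in $A_0$ and $B$. A small
sketch follows (the function name is ours, introduced only for illustration):

```python
def max_identifiable_parameters(n_variables):
    """Number of distinct elements in the n x n innovation covariance matrix."""
    return n_variables * (n_variables + 1) // 2


n = 7            # R, M, P, Y, OPW, FF, e
estimated = 23   # free parameters in A0 and B, as reported in the text
maximum = max_identifiable_parameters(n)
print(maximum, maximum - estimated)  # 28 5
```

With seven variables the maximum is 28, so the 23 estimated parameters leave
five over-identifying restrictions to be tested.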

Unfortunately the coefficients in the $A_0$ matrix are estimated rather
imprecisely. If we consider the US-Germany case, the only significant parameters
in the matrix are $a_{53}$ and $a_{72}$. The first parameter is difficult to
interpret, given that the identification scheme does not explicitly address
macroeconomic shocks, while the point estimate of the second parameter implies
an appreciation of the dollar against the D-mark in response to a restrictive US
monetary policy. The potential simultaneous feedback between foreign monetary
policy and the exchange rate does not seem to be empirically relevant. However,
all the puzzles disappear and the empirical results for the impulse response
functions seem to be broadly in line with results from the US closed-economy
model. Given that this VAR includes a proxy for the commodity price index, the
evidence cannot be decisive on the source of the “puzzles”, although the fact
that the simultaneous feedback between foreign interest rates and the exchange
rate is not significant is consistent with attributing a substantial role to the
omission of commodity prices.

Also in this case the sample considered spans different regimes; moreover, this
methodology brings broad monetary aggregates back into the specification.
Interestingly, money is now used to extract demand rather than supply shocks;
however, the specification of money demand implicit in the VAR might not be
rich enough to capture the dynamics in the data. As pointed out by Faust and
Whiteman [23], single-equation work by Hendry and colleagues on money demand
has clearly shown the importance of including in the model the opportunity cost
of holding money, which is often a spread between interest rates. Interest
spreads capturing the opportunity cost of holding money are never included in
VAR models of the MTM. An identification similar to the one adopted by Kim and
Roubini is proposed for the Canadian case by Cushman and Zha [18], who aid the
structural identification by explicitly introducing the trade sector into the
model.

An interesting alternative approach to the identification of the simultaneous
feedback between non-US interest rates and exchange rates is proposed by Smets
[54], [55]. Smets considers the following structural model for non-US countries:

where $\Delta y_t$ is output growth, $\Delta p_t$ is inflation, $R_t$ is a
short-term interest rate and $\Delta e_t$ is the exchange rate appreciation. No
US variables are introduced, and the commodity price index is also excluded.
However, Smets is more ambitious than average, aiming at identifying both
macroeconomic and monetary shocks. Three types of restrictions are imposed.
First, the semi-structural restrictions: macro variables do not react
contemporaneously to monetary variables. Second, macroeconomic supply shocks are
identified from macroeconomic demand shocks by assuming that the long-run effect
of demand shocks on output is zero. Third, monetary policy shocks are identified
from exchange rate shocks by assuming that the central bank reacts
proportionally to interest rate and exchange-rate developments (short-term MCI).
Macroeconomic shocks are separated into demand and supply shocks by noting that
the long-run response of output to a demand shock is given by the element (1,1)
of the matrix [...] the $A_0$ matrix and therefore from the estimation of the
reduced-form VAR one can retrieve all the elements in the $A_i$ matrices and
generate an identifying restriction for the structural parameters in the $B$
matrix by setting the element (1,1) [...] it can be easily shown that $k_{11}$,
$k_{12}$, $k_{13}$, $k_{14}$ are determined independently from the parameters in
$A_0$; therefore, restricting to zero the long-run effect of the demand shock on
output, $k_{11}b_{11} + k_{12}b_{21} + k_{13}b_{31} + k_{14}b_{41}$, we have

$$b_{21} = -\frac{k_{11}b_{11} + k_{13}b_{31} + k_{14}b_{41}}{k_{12}}$$
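Since the zero long-run restriction on the response of output to the demand
shock is linear in the elements of the first column of $B$, it can be solved for
one of them once the others and the first row of the long-run matrix are given.
A minimal sketch with hypothetical numbers follows; the function name and the
values are ours and serve only to illustrate the algebra.

```python
def b21_from_long_run_restriction(k_row, b11, b31, b41):
    """Solve k11*b11 + k12*b21 + k13*b31 + k14*b41 = 0 for b21.

    k_row holds (k11, k12, k13, k14), the first row of the long-run matrix.
    """
    k11, k12, k13, k14 = k_row
    if k12 == 0:
        raise ValueError("k12 must be non-zero for the restriction to pin down b21")
    return -(k11 * b11 + k13 * b31 + k14 * b41) / k12


# Hypothetical values, for illustration only.
k_row = (1.5, 0.5, -0.2, 0.1)
b11, b31, b41 = 0.8, 0.3, -0.4
b21 = b21_from_long_run_restriction(k_row, b11, b31, b41)

# The implied long-run effect of the demand shock on output is zero.
long_run_effect = sum(k * b for k, b in zip(k_row, (b11, b21, b31, b41)))
print(round(b21, 6), abs(long_run_effect) < 1e-12)
```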

Lastly, in the monetary block, monetary policy shocks are identified from
exchange rate shocks by assuming that the appropriate indicator of the exogenous
monetary stance is a short-term MCI in which the exchange rate and the interest
rate are appropriately weighted. The weights can be estimated or imposed given
the knowledge of the relative weights in the MCIs of several central banks. This
approach encompasses as special cases pure interest rate targeting and pure
exchange rate targeting. The main empirical problems with this procedure are the
instability of the estimated weights and the potentially disruptive implications
of misspecification for the identification of aggregate demand and supply shocks
(see Faust and Leeper [22] on this point).

6.6.1 Replicating the empirical evidence 

The data-set BERLIN.WF1 contains the relevant variables to replicate the
evidence discussed so far on open-economy VAR models.

We first estimate a benchmark open-economy model for the US and the German
economy without including the Commodity Price Index. The model is estimated on
monthly data over the sample 1983:1-1997:12. The VAR is specified by including
six lags of US industrial production, the US consumer price index, the Federal
Funds rate, German industrial production, the German consumer price index, the
German call money rate, and the US-dollar/Deutschemark nominal exchange rate
(units of DM for one US dollar). The choice of the sample is motivated by the
need to have a single monetary policy regime for the US, featuring Fed funds
targeting (Bagliano and Favero [2]). Impulse responses for all variables to US
and German monetary policy shocks are reported in the following figures.

Fig. 6.7. Impulse responses to US monetary policy shock in the benchmark
VAR open-economy model

Fig. 6.8. Impulse responses to German monetary policy shock in the benchmark
VAR open-economy model

We have confirmation of all the facts and puzzles observed in the literature.
The analysis of the contemporaneous feedback between variables within the
recursive specification provides evidence on the endogeneity of US monetary
policy, which reacts significantly to internal conditions, and of German
monetary policy, which reacts to both internal conditions and, less
significantly, to US monetary policy. The exchange rate reacts contemporaneously
and significantly to US monetary policy (a positive interest rate shock in the
US induces an appreciation of the US dollar vis-a-vis the DM) and to
macroeconomic conditions in the US and Germany (a positive shock in US
industrial production and in German prices leads contemporaneously to an
appreciation of the US dollar), but it is not contemporaneously significantly
affected by German monetary policy.

The analysis of the responses to monetary impulses in the US and Germany
confirms all the main findings of the literature, namely:

• a significant U-shaped response of US output to US monetary policy

• the existence of a price puzzle both for the US and Germany

• the existence of a forward discount puzzle generated by a US monetary policy
restriction

• an exchange rate puzzle for the German monetary policy shock

6.6.2 Omitted variables

Our analysis of VAR models of the MTM in closed economies has shown the crucial
importance of the Commodity Price Index in the derivation of monetary policy
shocks. The arguments made for the inclusion of this variable in closed-economy
VARs of the MTM are also compelling for open-economy VARs. It might very well be
the case that the puzzles observed in open economies are related to
misspecification, via the omission of a commodity price index in the benchmark
open-economy VAR. We consider this potential explanation by concentrating on the
open-economy VAR model linking the US and the German economy.

We then include a commodity price index by keeping the Choleski identification
and considering Pcm as a macroeconomic variable influencing both US and German
monetary policy. The new impulse responses are reported in Figures 6.9 and 6.10.

Fig. 6.9. Impulse responses to US monetary policy shock in the benchmark
VAR with commodities price index (dashed lines: 68% confidence band)

Fig. 6.10. Impulse responses to German monetary policy shock in the benchmark
VAR with commodities price index (dashed lines: 68% confidence band)

The results in the two figures show that the inclusion of commodity prices
solves the price puzzle for both countries; moreover, the forward discount bias
puzzle and the exchange rate puzzle also tend to disappear. Finally, although we
do not observe a symmetric contemporaneous effect of US and German monetary
policy on the exchange rate, the impulse response functions of the exchange rate
to the two monetary shocks over a horizon of four years show a remarkable degree
of symmetry.

Although the inclusion of commodity prices seems very relevant in fixing many
of the empirical problems in open-economy VARs, we have raised the issue of
potential simultaneity between the exchange rate and the policy rate in small
open economies and not yet addressed it. We shall consider this issue by looking
at the more general problem of assessing the reliability of the measurement of
monetary policy with VAR models.

6.7 VAR and non-VAR measures of monetary policy

Econometric measurement of monetary policy has always been a debated issue.
VAR models are linear, constant-parameter autoregressive distributed lag
models, bound to include a very limited number of variables with a very
parsimonious lag parameterization. The crucial step in deriving evidence from
the data using VARs is the possibility of imposing identifying restrictions
independently from theoretical models. We have illustrated how a consensus has
been reached in the case of the closed economy and how the same result has not
yet been reached for open economies. We have provided an interpretation of this
difference in the light of the difficulties in identifying monetary policy
shocks in open economies. Recently, VAR-based monetary policy shocks have been
compared with monetary policy shocks measured by alternative approaches. We
think that these developments can be useful not only to evaluate VAR
methodologies, but also to help identification when, as in the case of the
open-economy models, the traditional VAR methods have problems in delivering the
necessary number of identifying restrictions.

6.7.1 Non-VAR measures of monetary policy shocks 

Historically, alternatives to the econometric measurement of monetary policy
have always been considered: think, for example, of the qualitative indicators
of monetary policy derived by adopting the “narrative approach” of Romer and
Romer [44]-[45]. In a recent paper, Leeper [34] shows that even the dummy
variable generated by the “narrative approach” (identifying episodes of
deliberate monetary contractions) is predictable from past macroeconomic
variables, thus reflecting the endogenous response of policy to the economy, so
that the estimated coefficients cannot provide an unbiased estimate of the
response of the macroeconomic variables to a monetary impulse.

Recently the attention of monetary economists has turned to financial markets,
which are a potential source of very powerful information on, and measurement
of, monetary policy. We shall consider a variety of measures of monetary policy
derived from financial markets and assess their role in the evaluation of
VAR-based monetary policy shocks in open and closed economies.

A first possible alternative was originally proposed by Rudebusch [46] and
further analyzed by Brunner [9]. Monetary policy shocks are derived from the
30-day Fed funds futures contracts, which have been quoted on the Chicago
Board of Trade since October 1988 and are bets on the average overnight fed
funds rate for the delivery month, the variable included in benchmark VARs.
Shocks are constructed as the difference between the federal funds rate at month
t and the 30-day federal funds future at month t - 1. Such a choice is based on
the evidence that the regression of the federal funds rate at t on the 30-day
federal funds future at t - 1 produces an intercept not significantly different
from zero, a slope coefficient not significantly different from one, and
serially uncorrelated residuals:

$$FF_t = \underset{(0.0436)}{-0.037} + \underset{(0.007)}{0.999}\, FFF_{t-1} + u_t$$

$$R^2 = 0.99 \qquad \sigma = 0.145 \qquad DW = 1.86$$

Note that this procedure produces shocks, labelled FFF, which are comparable
to the reduced-form innovations from the VAR and not to the structural monetary
policy shocks, because surprises relative to the information available at the
end of month t - 1 may reflect endogenous policy responses to news about the
economy that becomes available in the course of month t. However, if an
identification scheme is available, then the innovations derived from the
futures contract can be transformed into the relevant shocks by applying to them
the standard VAR identification procedure. A non-trivial problem with this
procedure is that Federal Funds futures are available only from 1988 onwards,
whereas futures on the 1-month Eurodollar are available on a more extended
sample. Given that the properties of the series generated by the 1-month
Eurodollar are very close to the properties of Federal Funds futures, the direct
measurement based on the 1-month Eurodollar could be used on an extended sample.
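The construction of the futures-based innovations, and the two hypotheses on
the regression coefficients, can be sketched as follows. The t-ratios use only
the coefficients and standard errors reported in the regression above; the
monthly rates are hypothetical and the function names are ours.

```python
def futures_based_shocks(fed_funds, futures):
    """Shock in month t = fed funds rate in t minus the futures rate quoted in t-1.

    Both lists are monthly and aligned; futures[t] is the 30-day contract rate
    observed at the end of month t. Returns one shock per month from t = 1 on.
    """
    return [fed_funds[t] - futures[t - 1] for t in range(1, len(fed_funds))]


def t_ratio(estimate, std_error, null_value=0.0):
    """t-statistic for the null hypothesis that the coefficient equals null_value."""
    return (estimate - null_value) / std_error


# Coefficients and standard errors as reported in the regression.
t_intercept = t_ratio(-0.037, 0.0436)            # H0: intercept = 0
t_slope = t_ratio(0.999, 0.007, null_value=1.0)  # H0: slope = 1
print(round(t_intercept, 2), round(t_slope, 2))

# Hypothetical monthly rates (per cent), for illustration only.
ff = [5.25, 5.30, 5.50, 5.45]
fff = [5.28, 5.35, 5.48, 5.40]
shocks = futures_based_shocks(ff, fff)
print([round(s, 2) for s in shocks])
```

Both t-ratios are well below conventional critical values in absolute terms,
which is what justifies using the simple difference as the innovation.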

A second non-VAR measure of policy shocks is based on the work of Skinner
and Zettelmeyer [53]. They derive a measure of unanticipated monetary policy
shocks by following a two-step methodology: first, using information from
central bank reports and newspapers, a list of days on which monetary policy
announcements occurred is constructed; second, monetary policy shocks are
identified with the changes in the three-month interest rate on the days of
policy announcements. The validity of such a procedure requires that (i) short
rates (e.g. the overnight rate) are affected by policy; (ii) arbitrage is
effective between the overnight and the three-month interest rate; (iii) the
impact of other news affecting the three-month rate on the day of the policy
decision is negligible; (iv) policy actions are not endogenous responses to
information that becomes available on the day when the decision is taken. To
ensure that conditions (iii) and (iv) are applicable, Skinner and Zettelmeyer go
through reports of the policy actions and exclude from their sample those which
do not conform to requirements (iii) and (iv). The main problem with the index
so obtained is that it can only pin down shocks associated with monetary policy
decisions reflected in some action on controlled variables, whereas shocks
associated with no action (while some action was expected by the markets) are
neglected.
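The two-step procedure can be sketched as follows. This is only an
illustration of the logic described above, not the authors' code: the function
name, the dates, and the rates are hypothetical, and the "previous day" is taken
to be the previous observation available in the data.

```python
def announcement_day_shocks(three_month_rate, announcement_days, excluded_days=()):
    """Change in the three-month rate on each retained announcement day.

    three_month_rate: dict mapping an ISO date string to the closing rate.
    announcement_days: days on which a policy announcement occurred (step one).
    excluded_days: announcements dropped because they fail conditions (iii)-(iv).
    """
    days = sorted(three_month_rate)
    shocks = {}
    for day in announcement_days:
        if day in excluded_days:
            continue
        position = days.index(day)
        if position == 0:
            raise ValueError("no observation before " + day)
        shocks[day] = three_month_rate[day] - three_month_rate[days[position - 1]]
    return shocks


# Hypothetical closing rates (per cent), for illustration only.
rates = {
    "2000-03-14": 3.40,
    "2000-03-15": 3.42,
    "2000-03-16": 3.60,
    "2000-03-29": 3.61,
    "2000-03-30": 3.58,
}
shocks = announcement_day_shocks(
    rates,
    announcement_days=["2000-03-16", "2000-03-30"],
    excluded_days=["2000-03-30"],  # assumed to be contaminated by other news
)
print({day: round(value, 2) for day, value in shocks.items()})
```

Because the measure is indexed by announcement days, a day on which markets
expected a move that did not take place never enters the list, which is the
limitation noted in the text.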

An alternative approach which might overcome this problem has been proposed
by Bagliano and Favero [2], applying the methodology set out in Svensson
[58] and Soderlind and Svensson [56]. The methodology is based on the use of
instantaneous forward rates as monetary policy indicators. Forward rates are
interest rates on investments made at a future date, the settlement date, and
expiring at a date further into the future, the maturity date. Instantaneous
forward interest rates are the limit as the maturity date and the settlement
date approach one another.

To illustrate our derivation of spot rates, let us start by considering a
zero-coupon bond issued at time t with a face value of 1, maturity of m years
and price $P^{zc}_{mt}$. The simple yield $Y_{mt}$ is related to the price as
follows:

$$P^{zc}_{mt} = \frac{1}{(1 + Y_{mt})^{m}} \qquad (6.24)$$

Defining the spot rate $r_{mt}$ as $\log(1 + Y_{mt})$, which is the continuously
compounded yield, and the discount function $D_{mt}$ as the price at time t of a
zero-coupon bond that pays one unit at time t + m, we then have:

$$P^{zc}_{mt} = \exp(-m\, r_{mt}) = D_{mt} \qquad (6.25)$$

Consider now a coupon bond that pays a coupon rate of c per cent annually and
pays a face value of 1 at maturity. The price of the bond at the trade date is
given by the following formula:

$$P^{c}_{mt} = \sum_{k=1}^{m} c\, D_{kt} + D_{mt} \qquad (6.26)$$
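The relations between simple yield, spot rate, discount function, and coupon
bond price can be checked numerically. The sketch below assumes annual coupons
paid exactly at integer maturities and expresses the coupon rate in decimal
form; the function names and the numbers are ours.

```python
import math


def spot_rate(simple_yield):
    """Continuously compounded spot rate r = log(1 + Y)."""
    return math.log(1.0 + simple_yield)


def discount_factor(spot, maturity):
    """Price today of one unit paid after `maturity` years: D = exp(-m * r)."""
    return math.exp(-maturity * spot)


def coupon_bond_price(coupon, spot_curve):
    """Price of a bond paying `coupon` each year and a face value of 1 at maturity.

    spot_curve[k-1] is the spot rate for maturity k = 1, ..., m.
    """
    discounts = [discount_factor(r, k) for k, r in enumerate(spot_curve, start=1)]
    return coupon * sum(discounts) + discounts[-1]


# The zero-coupon price is the same from the simple yield and from the spot rate.
y, m = 0.05, 3
same_price = abs((1 + y) ** -m - discount_factor(spot_rate(y), m)) < 1e-12
print(same_price)

# With a flat curve at log(1 + c), a bond with coupon c trades at par.
c = 0.06
par_price = coupon_bond_price(c, [spot_rate(c)] * 5)
print(round(par_price, 10))
```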

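Given the observation of prices of coupon bonds, spot rates on zero-coupon
equivalents can be derived by fitting a discount function based on the following
specification for the spot rates:

$$r_{kt} = \beta_0 + \beta_1 \frac{1 - \exp(-k/\tau_1)}{k/\tau_1} + \beta_2 \left[ \frac{1 - \exp(-k/\tau_1)}{k/\tau_1} - \exp\left(-\frac{k}{\tau_1}\right) \right] + \beta_3 \left[ \frac{1 - \exp(-k/\tau_2)}{k/\tau_2} - \exp\left(-\frac{k}{\tau_2}\right) \right] \qquad (6.27)$$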

Such a specification was originally introduced by Svensson [58] and it is an
extension of the parametrization proposed by Nelson and Siegel [42]. Implied
forward rates can be calculated from spot rates. A forward rate at time t with
settlement date t + t' and maturity date t + T can be calculated as the return
on an investment strategy based on buying zero-coupon bonds at time t maturing
at time t + T and selling at time t zero-coupon bonds maturing at time t + t'.
The forward rate is related to the spot rate by the following formula:


$$f_{t+T,\,t+t',\,t} = \frac{T\, r_{T,t} - t'\, r_{t',t}}{T - t'} \qquad (6.28)$$

so the forward rate for a 1-year investment with settlement in 2 years and
maturity in 3 years is equal to three times the 3-year spot rate minus twice
the 2-year spot rate. The instantaneous forward rate is the rate on a forward
contract with an infinitesimal investment period after the settlement date:

$$f_{mt} = \lim_{T \to m} f_{t+T,\,t+m,\,t} \qquad (6.29)$$
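The forward rate implied by two spot rates can be computed in a couple of
lines. The sketch assumes continuously compounded spot rates with maturities
measured in years; the spot rates used are hypothetical.

```python
def forward_rate(spot_settle, t_settle, spot_mature, t_mature):
    """Forward rate between t_settle and t_mature implied by two spot rates."""
    if t_mature <= t_settle:
        raise ValueError("maturity must come after settlement")
    return (t_mature * spot_mature - t_settle * spot_settle) / (t_mature - t_settle)


# Hypothetical spot rates in decimal form: 2-year at 5.0%, 3-year at 5.4%.
r2, r3 = 0.050, 0.054
f_2_3 = forward_rate(r2, 2, r3, 3)
print(round(f_2_3, 4))  # three times the 3-year rate minus twice the 2-year rate
```

When the spot curve is upward sloping, as in this example, the implied forward
rate lies above both spot rates.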


In practice we identify the instantaneous forward rate with an overnight forward
rate, a forward rate with maturity one day after the settlement. The relation
between the instantaneous forward rate and the spot rate is then:

$$r_{mt} = \frac{1}{m} \int_{\tau = t}^{t+m} f_{\tau t}\, d\tau$$

or, equivalently,

$$f_{mt} = r_{mt} + m\, \frac{\partial r_{mt}}{\partial m} \qquad (6.30)$$

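Given specification (6.27) for the spot rate, the resulting forward function is
as follows:

$$f_{kt} = \beta_0 + \beta_1 \exp\left(-\frac{k}{\tau_1}\right) + \beta_2 \frac{k}{\tau_1} \exp\left(-\frac{k}{\tau_1}\right) + \beta_3 \frac{k}{\tau_2} \exp\left(-\frac{k}{\tau_2}\right) \qquad (6.31)$$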


Therefore, as k goes to zero the spot and the forward rate coincide at
$\beta_0 + \beta_1$, and as k goes to infinity the spot and the forward rate
coincide at $\beta_0$. The forward rate function features a constant, an
exponential term decreasing when $\beta_1$ is positive, and two “hump-shaped”
terms. In principle $\beta_0 + \beta_1$ can be restricted to match the observed
overnight rate, but we do not follow this strategy. A standard practice in the
application of this curve-fitting approach is to include the overnight rate in
the information set, sometimes constraining the fitted overnight rate to match
the observed one in estimation. However, a monetary policy shock implies by
definition a jump in, at least, the short end of the term structure. Forcing the
smooth instantaneous forward rate curve to fit exactly the observed overnight
rate would not allow us to capture a possible expected monetary policy action.
For this reason, we exclude the overnight rate from the information used for
estimation. Then, exploiting the continuity of the functional form, we
reconstruct the very short end of the term structure, allowing for a gap between
the estimated overnight rate and the observed overnight rate. Such a gap
represents the jump in the very short end of the term structure associated with
expectations of intervention by the central bank.

An example can clarify matters. On the occasion of the meeting held on the 2nd
of December 1993 the Bundesbank reduced the repo rate by 25 basis points. At the
close of the markets before the meeting we observed the structure of spot rates
relevant to the estimation of our yield curves reported in Table 1.


TABLE 1: German interest rates

Maturity    30/11/93    2/12/93
o/n           6.70        6.35
7-days        6.44        6.31
1-month       6.44        6.31
3-month       6.19        6.06
6-month       5.81        5.75
1-year        5.37        5.25
2-year        5.08        5.03
3-year        5.05        5.02
4-year        5.16        5.15
5-year        5.30        5.29
7-year        5.69        5.68
10-year       6.16        6.17


In Figure 6.11 we report Nelson-Siegel interpolants. More precisely, we report
the two instantaneous forward curves associated respectively with the spot curve
estimated excluding the overnight rate (IFW) and with the spot curve estimated
including the overnight rate (IFOY).

FIG. 6.11 Estimated forward-rate curves on the 30/11/93 with and without
the overnight rate (series: IFOY, IFW)
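The shape of such curves can be explored by evaluating the Svensson forward
function, a constant plus an exponential term and two hump-shaped terms, together
with the spot function obtained by averaging the forward rates between 0 and k.
The sketch below is not the estimation procedure used for the figure: the
parameter values are hypothetical, and the names tau1 and tau2 for the two time
constants are our own labels.

```python
import math


def svensson_forward(k, b0, b1, b2, b3, tau1, tau2):
    """Instantaneous forward rate at maturity k (k > 0)."""
    x1, x2 = k / tau1, k / tau2
    return b0 + b1 * math.exp(-x1) + b2 * x1 * math.exp(-x1) + b3 * x2 * math.exp(-x2)


def svensson_spot(k, b0, b1, b2, b3, tau1, tau2):
    """Spot rate at maturity k: the average of forward rates between 0 and k."""
    x1, x2 = k / tau1, k / tau2
    load1 = -math.expm1(-x1) / x1  # (1 - exp(-x1)) / x1, computed accurately
    load2 = -math.expm1(-x2) / x2
    return b0 + b1 * load1 + b2 * (load1 - math.exp(-x1)) + b3 * (load2 - math.exp(-x2))


# Hypothetical parameter values (rates in per cent), for illustration only.
params = dict(b0=6.5, b1=-0.4, b2=-2.0, b3=1.0, tau1=1.5, tau2=6.0)

for k in (1e-6, 1.0, 5.0, 1e6):
    spot = svensson_spot(k, **params)
    forward = svensson_forward(k, **params)
    print(k, round(spot, 4), round(forward, 4))
```

At very short maturities both rates are close to b0 + b1 and at very long
maturities both approach b0, so the gap between the fitted short end and the
observed overnight rate is not absorbed by the functional form itself.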


Fitting the curve on data including the overnight rate, without allowing for a
jump in the term structure from the date of the Bundesbank Council meeting
onwards, would have spuriously generated an interest rate shock.

If the pure expectations model is valid and there is no term premium, then
instantaneous forward rates at future dates can be interpreted as the expected
spot interest rates for those future dates. The observable equivalent of the
instantaneous forward rate is the overnight rate.

The following strategy for identification of monetary policy shocks exploits 
directly the relation between spot rates and instantaneous forward rates: 

• exploiting the fact that intervention on policy rates for Germany and the US
takes place on the occasion of regular meetings of the Bundesbank Council and
of the FOMC (since 1994), collect data on the term structure of interest
rates the day before the monetary policy meetings. Observations on the one-day
and seven-day rates, the 1m, 3m, 6m, and 12m euro rates, and the 3, 5, 7, and
10-year fixed interest rate swaps are available on DATASTREAM and other
databases;

• estimate a term structure for spot rates and the associated curve of
instantaneous forward rates;

• interpret the instantaneous forward rate as the overnight rate, and derive
from the curve the expected implicit overnight rate for the day after the
monetary policy meeting;

• derive monetary policy shocks by subtracting from the overnight rate observed
the day after the policy meeting the overnight rate implicit in the curve
the day before the policy meeting;

• aggregate the above daily measures (concentrated in a few special days) to
construct monthly measures of shocks.
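The last two steps of the strategy above can be sketched as follows, taking the
implied overnight rates from the fitted curves as given. The dates and rates are
hypothetical, and summing the meeting-day shocks within each month is an
assumption of this sketch, since the text only states that the daily measures
are aggregated.

```python
from collections import defaultdict


def meeting_shock(observed_overnight_after, implied_overnight_before):
    """Shock at one meeting: observed rate after minus rate implied before."""
    return observed_overnight_after - implied_overnight_before


def monthly_shocks(meetings):
    """Aggregate meeting-day shocks by month (summation is an assumption).

    meetings: iterable of (date, observed_after, implied_before), with the
    date given as an ISO string 'YYYY-MM-DD'.
    """
    totals = defaultdict(float)
    for date, observed_after, implied_before in meetings:
        totals[date[:7]] += meeting_shock(observed_after, implied_before)
    return dict(totals)


# Hypothetical meetings (rates in per cent), for illustration only.
meetings = [
    ("2000-03-02", 3.30, 3.30),  # no action, none expected: no shock
    ("2000-03-16", 3.55, 3.35),  # action larger than expected: positive shock
    ("2000-04-13", 3.55, 3.60),  # expected tightening did not occur: negative shock
]
shocks = monthly_shocks(meetings)
print({month: round(value, 2) for month, value in shocks.items()})
```

The third meeting shows why this measure can capture shocks generated by
inaction: the policy rate does not move, but the rate implied by the curve the
day before differs from the rate observed afterwards.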


There are several difficulties that one should overcome in constructing this
measure of monetary policy shocks. Following Bagliano and Favero [3], we
illustrate examples of monetary shocks generated by unanticipated action or by
unanticipated inaction by the Bundesbank, and examples of markets’ anticipation
of Bundesbank behaviour when expectations on monetary policy turned out to be
correct and no shocks were observed. We consider the sample 1984-1997.

Consider first July 1988. In this month the Bundesbank Council met twice,
on the 14th and on the 28th. At the first meeting the Bundesbank did not take
any action; at the second it was decided to raise the Lombard Rate by 50 bp.
In Figure 6.12 we report the weekly and the overnight rate, along with the
monetary policy action (PMA).

FIG. 6.12 Monetary policy interventions and short-term interest rates in
Germany. July 1988

We shade areas of three days centered around meetings. We note that no
monetary policy action was expected at the first meeting, while some action
was expected before the second one. Six days before the meeting, the weekly rate
covers six days of maturity which do not include the action and a seventh one
which does include it, so the weekly rate should start to “reflect” the monetary
policy action six days before the meeting. Of course the weight of the seventh
day is one-seventh, so the information does not appear clearly six days before,
but as we approach the date of the Council the weight of the action becomes
greater and the expectation discloses itself. It can be observed that the weekly
rate starts reacting three days before the meeting. It is also possible that the
market realizes that the Bundesbank will act only a few days before the Council
(say less than six days before); in this case the weekly rate starts reacting
later than six days before the Council. The weekly rate should be the best
observed interest rate to identify expectations on monetary policy actions. In
fact, Council meetings take place fortnightly and the 1-month rate immediately
before any meeting reflects expectations on the outcome of the following two
meetings.
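The weighting argument can be illustrated with a small calculation. The sketch
assumes the pure expectations hypothesis with no term premium, and that the
expected policy change moves the overnight rate one-for-one from the meeting day
onwards; both are simplifying assumptions, and the rates are hypothetical.

```python
def expected_weekly_rate(days_before_meeting, current_overnight, expected_change,
                         maturity_days=7):
    """Weekly rate implied by averaging expected overnight rates over its life."""
    # Number of days in the life of the contract that fall after the meeting.
    days_after = max(0, min(maturity_days, maturity_days - days_before_meeting))
    weight = days_after / maturity_days
    return current_overnight + weight * expected_change


# Hypothetical case: overnight rate at 4.00 per cent, expected increase of 50 bp.
for days_before in range(7, -1, -1):
    rate = expected_weekly_rate(days_before, 4.00, 0.50)
    print(days_before, round(rate, 3))
```

Six days before the meeting the expected action has a weight of one-seventh, so
the weekly rate moves by only about 7 basis points; the effect grows as the
meeting approaches, which is why the expectation becomes visible only in the
last few days.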
The second episode we consider is the tightening of monetary policy that
occurred after German reunification in January-February 1991. Two meetings were
held in this period, on the 17th of January and the 2nd of February. As Figure
6.13 clearly shows, the weekly rate increased sharply just before the first
Council, revealing an expected increase in interest rates.

FIG. 6.13 Monetary policy interventions and short-term interest rates in
Germany. January 1991 (two Bundesbank Councils)

The Bundesbank did not act at that meeting. We immediately observe that
the expected tightening happened at the following Council meeting, when
the Bundesbank raised the Discount Rate and the Lombard Rate by 50 bp. To
summarize, on the 17th of January we observed a monetary policy shock arising
from an anticipated action that did not occur, while on the 2nd of February
there is no shock, as the policy move had been correctly anticipated.

The third episode we single out occurred in December 1991 (see Figure 6.14),
when the Bundesbank tightened monetary policy, raising once again the
Discount Rate and the Lombard Rate by 50 bp. The dates of the Bundesbank
Councils are the 5th and the 19th of December. At the latter meeting the
Bundesbank surprised the market, and we observe a shock arising from an
unexpected policy action.

FIG. 6.14 Monetary policy interventions and short-term interest rates in
Germany. December 1991 (two Bundesbank Councils)


The main strength of the methodology based on forward interest rate curves
is its flexibility and its capability to capture shocks independently of the
specification and parameterization of a linear autoregressive model. The main
limitation of this approach is caused by the volatility of very short-term rates
not related to expectations on monetary policy. Figures 6.15-6.16 report daily
observations on the overnight rate and the weekly rate for the estimation sample
period used in the VAR.
FIG. 6.15 The German 7-days rate

FIG. 6.16 The German overnight rate

We immediately note a number of blips in the series. Those blips could be
very damaging to our methodology whenever they occur on the occasion of a
Bundesbank Council meeting. Most of those blips are generated by banks' reserve
management running into imperfectly liquid markets, for instance on the last
day of the average reserve maintenance period. We make an effort
to render our inference robust to blips. In fact, we have estimated our curves
starting from the 7-day rather than the overnight rate, and our methodology
of estimation considers the information contained in the whole term structure.
However, we have run a further check and avoid labelling as policy shocks all
unexpected movements in policy rates which have disappeared within a week after
the Council meeting. This correction led us to single out two outliers, in 1988:9
and 1991:12. The 1988:9 outlier, whose determination is described in Figure 6.17,
is the only one of relevant magnitude.

In Figure 6.17 we report the behaviour of the 7-days and the 1-month rates in
the course of September 1988. No policy intervention was decided in September
1988; however, just before the mid-September meeting we observe a hike
in the 7-days rate. This hike is not reflected in the term structure at longer
maturities (we report the 1-month rate for reference). It would have been labelled
as a shock by the methodology; however, as it is reversed within the week after
the meeting, this episode should not be considered a monetary policy shock.
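The robustness check just described can be sketched as a simple reversal filter. The Python fragment below is only an illustration under stated assumptions, not the procedure actually used to construct the shock series: the function name, the tolerance and the rate levels in the usage lines are all hypothetical.

```python
def classify_move(pre, at_meeting, week_after, tol=0.10):
    """Label an unexpected move in a short-term rate around a Council meeting.

    pre        : rate level before the move (per cent)
    at_meeting : rate level just before/at the meeting
    week_after : rate level one week after the meeting
    tol        : hypothetical tolerance (percentage points) below which
                 a change is treated as negligible
    """
    surprise = at_meeting - pre
    if abs(surprise) <= tol:
        return "no shock"
    # A blip is a move that has essentially disappeared a week after the meeting.
    if abs(week_after - pre) <= tol:
        return "blip"
    return "candidate shock"


# Hypothetical rate levels, for illustration only.
print(classify_move(4.90, 5.60, 4.95))  # move reversed within the week
print(classify_move(8.00, 8.50, 8.50))  # move persists after the meeting
```

A move classified as a blip is excluded from the set of policy shocks, which is the treatment applied to the September 1988 episode.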



FIG. 6.17 The German 7-days (ECWGM7D) and one-month (ECWGM1M) rates in September 1988

6.8 Empirical results 

Non-VAR measures of monetary policy can be used in three ways. They can be
compared directly with VAR measures; they can be used to assess the robustness
of VAR-based descriptions of the monetary transmission mechanism; finally, they
can be exploited, within a VAR, to aid identification of other structural shocks.
To illustrate these possibilities we consider in turn the closed-economy (US) case
and the open-economy (US-Germany) case.

6.8.1 Closed economy (US)

To evaluate the role of non-VAR measures of monetary policy shocks, we
first estimate the closed-economy four-variable version of the VAR model for the
US and compute impulse response functions of all variables to a shock in the
Federal funds rate. Our model is specified as follows:

(6.32)

where A is lower triangular and B is diagonal.

The ordering chosen allows for a contemporaneous response of the policy rate
to innovations in output, consumer prices and the commodity price level. The
orthogonalized residual of the Federal funds rate equation, v^FF, is identified as
a monetary policy shock. No structural interpretation is given to the
(orthogonalized) residuals from the other equations in the system. We then consider
two non-VAR measures of monetary policy: that derived from the one-month Eurodollar
forward rate (EUR$) and that derived from the estimation of the instantaneous
forward rate curve on occasion of FOMC meetings (IFS^US). These alternative
shocks are plotted with the VAR-based shocks in Figures 6.18-6.19.
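To make the recursive identification concrete, the sketch below recovers a unit lower triangular A and a diagonal B from a reduced-form residual covariance matrix, with the policy rate ordered last. It is a minimal illustration: the covariance matrix is hypothetical (not estimated from the data used in this chapter), the function name is ours, and the normalisation of A to a unit diagonal is an assumption.

```python
import numpy as np


def recursive_identification(sigma):
    """Return (A, B), A unit lower triangular and B diagonal,
    such that A @ sigma @ A.T equals B @ B.T."""
    P = np.linalg.cholesky(sigma)         # sigma = P P', P lower triangular
    d = np.diag(P)                        # std. deviations of structural shocks
    A = np.diag(d) @ np.linalg.inv(P)     # rescale rows so that diag(A) = 1
    B = np.diag(d)
    return A, B


# Hypothetical covariance of reduced-form residuals, ordered as
# (output, commodity prices, consumer prices, Federal funds rate).
L = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.2, 1.2, 0.0, 0.0],
              [0.1, 0.3, 0.6, 0.0],
              [0.3, 0.1, 0.2, 0.8]])
sigma = L @ L.T

A, B = recursive_identification(sigma)
print(np.round(A, 3))
print(np.round(np.diag(B), 3))
```

The last row of A contains the contemporaneous responses of the policy rate to the other three innovations, and the last diagonal element of B is the standard deviation of the shock labelled as monetary policy.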



FIG. 6.18 Three-month centered moving averages of EUR$ shocks (solid line)
and closed-economy VAR monetary policy shocks (dotted line)

FIG. 6.19 IFS^US shocks (solid line) and closed-economy VAR monetary policy
shocks (dotted line)

Note that the EUR$ measure is available on a larger sample than the IFS^US
measure, as the practice of modifying monetary policy rates on given
and announced dates started only in the nineties. We report in Table 2 the
correlations of VAR and non-VAR measures of monetary policy.

TABLE 2: VAR and non-VAR monetary policy shocks
Correlation coefficients (standard errors on the diagonal)

Sample: 1988(11)-1996(10)

            EUR$      IFS^US    v^FF

Sample: 1988(11)-1996(10)

            EUR$      v^FF
EUR$        0.277
v^FF        0.500     0.211
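The comparisons behind Figure 6.18 and Table 2 rest on two elementary operations: a three-month centered moving average and a correlation coefficient between two shock series. A minimal sketch follows; the helper names and the simulated series are hypothetical and serve only to show the mechanics.

```python
import numpy as np


def centered_ma3(x):
    """Three-term centered moving average; the two end points are dropped."""
    x = np.asarray(x, dtype=float)
    return np.convolve(x, np.ones(3) / 3.0, mode="valid")


def shock_correlation(a, b):
    """Sample correlation coefficient between two shock series."""
    return float(np.corrcoef(a, b)[0, 1])


# Simulated series, for illustration only: a common component plus noise.
rng = np.random.default_rng(0)
common = rng.normal(size=120)
var_shock = common + rng.normal(size=120)
market_shock = common + rng.normal(size=120)

print(round(shock_correlation(var_shock, market_shock), 3))
print(centered_ma3([1, 2, 3, 4, 5]))
```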
Rudebusch [46], using the Federal funds futures contract, obtains results very
similar to those obtained in our shorter sample and concludes that VAR-based
measures of monetary policy do not make sense. We note that much better
results are obtained in the enlarged sample. To provide further evidence we
specify a VAR augmented by the non-VAR measure of monetary policy shocks,
considered as an exogenous variable. Following Amisano and Giannini [1], we
represent the estimated system as follows:

where A is lower triangular and B is diagonal. The estimated values of the
coefficients g_i are reported in Table 3.


TABLE 3: Coefficients on EUR$ and IFS^US in the benchmark VAR

Sample: 1988(11)-1997(11)

            y^US       Pcm        p^US       FF
EUR$        0.0061     0.0055     0.0013     0.468
            (0.0032)   (0.0121)   (1.0633)   (0.097)
IFS^US      0.0025     0.0082     0.0009     0.356
            (0.0031)   (0.0116)   (0.0013)   (0.099)

Sample: 1984(1)-1997(11)

            y^US       Pcm        p^US       FF
EUR$        0.0026     0.0007     0.0058     0.552
            (0.0016)   (0.0006)   (0.0063)   (0.062)


We note that none of the macroeconomic variables responds to the non-VAR
monetary policy shocks, while the Federal funds rate does. As suggested by the
correlations between shocks, results are stronger on the larger sample. We
therefore concentrate on this sample and compare impulse responses to monetary
policy shocks in the traditional benchmark VAR specification with impulse
responses to non-VAR monetary policy shocks in our augmented specification.

The results, shown in Figure 6.20, illustrate that a contractionary monetary 
policy shock produces the expected negative effect on output and a persistent 
effect on the Federal funds rate. 

The inclusion of the commodity price index is successful in solving the price 
puzzle: the consumer price level does not show a perverse response to restrictive 
policy. 

The pairs of impulse response functions, based on the VAR and the non-VAR
shocks, describe a very similar transmission mechanism, supporting the
evidence already provided by Brunner [9] and Bagliano and Favero [2] with
different exogenous measures. Although the correlation between EUR$ and the
measure of policy shock obtained from the benchmark VAR is only 0.5, the dynamic
effects of monetary policy show very close features: both measures capture
unexpected variations in the policy rate related to monetary policy, and the
existence of other non-policy disturbances does not change the basic features
of the response to a policy shock.
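For readers who wish to see how impulse responses of this kind are computed, the following sketch traces the responses of all variables to one structural shock in a first-order VAR with a lower triangular impact matrix. Both the autoregressive matrix and the impact matrix are hypothetical numbers chosen for illustration; they are not the estimates underlying Figure 6.20, and the lag length of one is an assumption made for brevity.

```python
import numpy as np


def impulse_responses(phi, impact, shock_index, horizons):
    """Responses to a unit structural shock in y_t = phi y_{t-1} + impact v_t.

    Row h of the result holds the response of every variable h periods
    after the shock.
    """
    n = phi.shape[0]
    unit = np.zeros(n)
    unit[shock_index] = 1.0
    out = np.empty((horizons + 1, n))
    resp = impact @ unit                  # impact response (horizon 0)
    for h in range(horizons + 1):
        out[h] = resp
        resp = phi @ resp                 # propagate one period ahead
    return out


# Hypothetical VAR(1) for (output, commodity prices, consumer prices, policy rate).
phi = np.array([[0.70, 0.00, 0.00, -0.10],
                [0.00, 0.60, 0.00, -0.05],
                [0.05, 0.10, 0.80, -0.02],
                [0.10, 0.05, 0.10, 0.70]])
impact = np.array([[1.0, 0.0, 0.0, 0.0],
                   [0.2, 1.2, 0.0, 0.0],
                   [0.1, 0.3, 0.6, 0.0],
                   [0.3, 0.1, 0.2, 0.8]])

irf = impulse_responses(phi, impact, shock_index=3, horizons=12)
print(np.round(irf[:4], 3))
```

With the policy rate ordered last, the other variables do not move on impact; their responses appear from the first period onwards through the lagged terms.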
Fig. 6.20 Responses to EUR$ shocks (solid line) and to VAR-based structural
shocks v^FF (dotted line), with one-standard-deviation confidence intervals
from the benchmark VAR. Panels: response of Y^US, response of Pcm, response
of P^US, response of FF

6.8.2 Open economy (US-Germany)

Let us now consider the open-economy version of the VAR system. We begin with
a baseline specification which includes the commodity price index:

where A is lower triangular and B is diagonal.

As we have done for the closed-economy case, we compare the orthogonalized
residual of the German call money rate equation (v^RGER) with the non-VAR measure
of German monetary policy shocks IFS^GER, derived from instantaneous forward
rates. Figure 6.21 and Table 4 confirm the results for correlations obtained in
the closed-economy case.
FIG. 6.21 Three-month centered moving averages of IFS^GER shocks (solid
line) and open-economy VAR German monetary policy shocks (dotted line)


TABLE 4: VAR and non-VAR monetary policy shocks
Sample: 1984(1)-1997(11)

            IFS^GER   v^RGER
IFS^GER     0.194
v^RGER      0.163     0.169


As in the closed-economy case, we augment the previously estimated system
by including the exogenous measure of German monetary policy shocks IFS^GER
described in the previous section.

The open-economy VAR is now the following:

Using our exogenous measure of monetary policy shocks in combination with
a Choleski ordering with the German policy rate coming last, we are able to
address directly the issue of simultaneity between German monetary policy and
the exchange rate. The contemporaneous effect of a monetary policy shock on
the exchange rate is given by the coefficient on IFS^GER in the exchange rate
equation (g_7), while the response of the German interest rate to innovations in
the exchange rate is endogenized by the ordering chosen. As shown in Table 5,
we do not observe a significant contemporaneous feedback between the German
interest rate and the exchange rate in either direction.
In our framework, this is a testable proposition rather than an assumed
identifying restriction. We note that our measure of monetary policy shocks enters
significantly in the German policy rate equation and that the contemporaneous
response of U.S. output to German monetary policy shocks is small but
marginally significant (see footnote 5).


TABLE 5: Coefficients on IFS^GER and simultaneous responses of e in the benchmark VAR

            Y^US      Pcm       P^US      FF        Y^GER     P^GER     e         R^GER
IFS^GER     -0.007    -0.01     -0.0013   -0.0892   0.00002   0.0029    0.0084    0.230
            (0.002)   (0.008)   (0.0008)  (0.1146)  (0.0011)  (0.007)   (0.0127)  (0.097)
e           1.36      -0.15     -0.15     0.022     0.045     2.44      -0.007    -0.008
            (0.037)   (0.11)    (1.09)    (0.0083)  (0.126)   (0.79)    (0.002)   (0.01)
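The significance statements made about Table 5 can be checked by forming ratios of coefficients to standard errors. The sketch below does this for the IFS^GER row. Two assumptions should be kept in mind: the column labels used as dictionary keys are our reading of the table header, and the threshold of two in absolute value is only the usual rule of thumb.

```python
# Coefficients on IFS^GER and their standard errors, as reported in Table 5.
# The keys are assumed column labels, not part of the original table.
coef = {"Y_US": -0.007, "Pcm": -0.01, "P_US": -0.0013, "FF": -0.0892,
        "Y_GER": 0.00002, "P_GER": 0.0029, "e": 0.0084, "R_GER": 0.230}
se = {"Y_US": 0.002, "Pcm": 0.008, "P_US": 0.0008, "FF": 0.1146,
      "Y_GER": 0.0011, "P_GER": 0.007, "e": 0.0127, "R_GER": 0.097}

# Ratio of each coefficient to its standard error.
t_ratios = {name: coef[name] / se[name] for name in coef}

for name, t in t_ratios.items():
    flag = "significant" if abs(t) > 2 else "not significant"
    print(f"{name:6s} t = {t:6.2f}  ({flag} by the rule of thumb)")
```

On this reading, the coefficient in the exchange rate equation is well below the threshold, while the coefficient in the German policy rate equation exceeds it.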

The pairs of impulse response functions shown in Figure 6.22, along with
one-standard-deviation bands, confirm qualitatively the results obtained for the
closed US economy: measuring monetary policy shocks using financial market data
does not alter the main features of the monetary transmission mechanism for
Germany.


5 We report impulse responses based on restricting this coefficient to zero; relaxing this
restriction does not affect the shape and magnitude of the impulse responses.

Fig. 6.22 Responses to IFS^GER shocks (solid line) and to VAR-based structural
shocks v^RGER (dotted line), with one-standard-deviation confidence intervals
from the benchmark VAR

6.9 Conclusions

The VAR approach to the monetary transmission mechanism aims at deriving
stylized facts that help the selection of the theoretical model to be
used for simulating the effects of monetary policy. The identification of
parameters in this type of model does not allow the researcher to separate deep
parameters describing taste and technology from expectational parameters dependent
on policy regimes. However, by estimating this type of model on a single policy
regime and by describing the responses of variables to structural shocks of
interest, it is hoped to derive some stylized facts on the monetary transmission
mechanism. We have interpreted the choice of concentrating
on shocks as a consequence of the impossibility of identifying deep parameters
independently from expectational parameters. Unfortunately, structural shocks
are not directly observable, and the imposition of a set of identifying restrictions
is a necessary prerequisite for the analysis. Given the aim of the analysis, it is
essential that identifying restrictions are imposed independently of specific
theories. All the developments of the Choleski ordering that we have discussed in the
chapter provide the researcher with tools for achieving this aim. In particular, we
have shown how information from financial markets can be used both to assess
the robustness of results derived within traditional VAR models of the monetary
transmission mechanism and to aid identification in cases when traditional
analysis does not deliver a sufficient number of restrictions. Within this framework
it is also natural that the number of identifying restrictions is kept
to a minimum.

VAR models of the monetary transmission mechanism are very
rarely cointegrated VARs. We have seen that multivariate cointegration analysis
requires the solution of a long-run identification problem, and that imposing
cointegrating restrictions on a VAR in levels increases the efficiency of the
estimator at the cost of the risk of inconsistency when incorrect identifying
restrictions are imposed. The monetary transmission mechanism is a short-run
phenomenon, and this explains why researchers prefer to employ unrestricted VARs
to evaluate impulse responses over a short to medium horizon. Cointegrated
VARs are, however, an almost inevitable choice when the relevant theory-neutral
restrictions are long-run restrictions.

As VAR models are the natural empirical counterparts of dynamic general
equilibrium monetary models, their statistical adequacy is not as closely
scrutinized as the adequacy of reduced-form specifications within the LSE approach.
In particular, in many applications outliers are not removed and
non-normality is not an uncommon feature. Parameter stability is also an issue in
the debate. As far as identification is concerned, the idea of using restrictions
neutral with respect to the theories under scrutiny is appealing but not always
implementable. In fact, VAR models of the monetary transmission in open economies
have not been as successful in establishing stylized facts, probably because of the
difficulties in generating a “neutral” identification scheme. Moreover, our analysis
of the empirical evidence on the monetary transmission mechanism has shown
a high level of uncertainty associated with VAR-based results: rather
large standard errors are associated with the point estimates of impulse response
functions. This is all the more so in the case of VARs in open economies, where
practitioners have developed the habit of reporting one-standard-deviation bands
rather than two-standard-deviation bands. The main consequence of such uncertainty
is that the aim of the exercise, once again model selection, is difficult to achieve
in practice.
7

INTERTEMPORAL OPTIMISATION AND THE GMM METHOD


7.1 Introduction 

The intertemporal optimisation approach to macroeconomic theory takes the
Lucas critique very seriously and is based on the conviction that questions like
“How should a central bank respond to shocks in macroeconomic variables?” are
to be answered within the framework of quantitative monetary general
equilibrium models of the business cycle.

The evaluation of the effects of monetary policy is a question for theoretical
models rather than for empirical ad-hoc macroeconometric models. We have seen
that VAR-based empirical evidence helps the selection of the relevant
theoretical model. However, once the model selection problem has been solved, two
issues remain: parameterization and simulation. In this chapter we mainly
concentrate on parameterization, while we shall devote the next chapter to
simulation and policy evaluation. The intertemporal optimisation approach takes
no interest in the parameters estimated by traditional macroeconometric
modelling. In fact, traditional structural econometric modelling delivers parameters
which are convolutions of the interesting “deep” parameters describing tastes
and technology and of expectational parameters, which are dependent on the
specific policy regime.

Interestingly, the intertemporal optimisation-Rational Expectations paradigm
generates in a very natural way an alternative econometric approach to estimate
the deep parameters of interest: the Generalized Method of Moments (GMM).

This chapter is devoted to the illustration of the empirically relevant aspects of
the GMM methodology. We shall consider applications to consumers' behaviour
and the estimation of monetary policy rules. We shall start by illustrating the
close relationship between the econometric methodology and intertemporal
optimisation achieved by the implementation of the GMM method in the
estimation of Euler equations. We shall then consider technically the definition of
the estimator, the problems related to the estimation of the covariance matrix
and inference within GMM models.

Having considered the technical aspects of the estimator, we evaluate its
success in the new-classical camp and its extremely rare utilization by
Keynesian macroeconomists by giving an econometric interpretation to a statement by
Mankiw, Rotemberg and Summers [23], who assert that

“... The major difference between modern neoclassical and traditional
Keynesian macroeconomic theories is that the former regard observed levels
of employment, consumption and output as realizations from dynamic
optimizing decisions by both households and firms, while the latter regard
them as reflecting constraints on households and firms ...”

We shall conclude by showing applications of the GMM approach to the 
estimation of deep parameters describing (representative) consumer behaviour 
and to the estimation of deep parameters describing central banks’ preferences 
in monetary policy rules. 

7.2 Euler Equations and “closed form solutions”

Consider the standard optimisation problem for the representative consumer:

$$\max \; E_t \sum_{i=0}^{\infty} (1+\delta)^{-i}\, U(c_{t+i}) \qquad (7.1)$$

subject to the following constraints:

$$A_{t+i} = (1+r)\, A_{t+i-1} + y_{t+i} - c_{t+i}$$

$$\lim_{i \to \infty} E_t A_{t+i} (1+r)^{-i} = 0$$

where $y$ is disposable labour income, $c$ is consumption of non-durable goods
and services, $A$ is wealth (a single financial asset) giving a return of $r$, and
$U$ is a utility function featuring both intertemporal and intratemporal separability.
All variables are expressed in real terms. The parameter $\delta$ describes the rate
of intertemporal preference of the representative consumer, who has an
infinite horizon and does not face liquidity constraints of any form. Therefore,
she can run negative balances of $A$ in any period, with the only constraint that
the present discounted value of wealth at time $t+i$ approaches zero as $i$
approaches infinity (transversality condition). Lastly, by $E_t$ we denote the
expectation formed conditionally upon the information set available at time $t$.

We solve the intertemporal optimisation problem by finding the maximum of
the following Lagrangean function:

$$\max_{c_{t+i},\, A_{t+i}} \; E_t \sum_{i=0}^{\infty} (1+\delta)^{-i}\, G_{t+i} \qquad (7.2)$$

$$G_{t+i} = U(c_{t+i}) + \lambda_{t+i} \left( A_{t+i} - (1+r) A_{t+i-1} - y_{t+i} + c_{t+i} \right)$$

We assume that the real return is non-stochastic and that the utility function
is of the Constant Relative Risk Aversion (CRRA) type:

$$U(c_{t+i}) = \frac{c_{t+i}^{1-\gamma}}{1-\gamma} \qquad (7.3)$$

where the parameter $\gamma$ describes the consumer's risk aversion. This
specification completes the setup of our problem. We note that there are two
parameters which describe tastes and could be defined as deep in Lucas's
terminology: $\gamma$ and $\delta$.
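The role of the risk aversion parameter in the CRRA specification can be verified numerically. The sketch below codes the utility function and its marginal utility, and checks by finite differences that the coefficient of relative risk aversion, minus c times the second derivative of utility over the first, is constant and equal to the curvature parameter. Function names and numerical values are ours and purely illustrative.

```python
def crra_utility(c, gamma):
    """CRRA utility c^(1-gamma)/(1-gamma), defined for gamma != 1."""
    return c ** (1.0 - gamma) / (1.0 - gamma)


def crra_marginal_utility(c, gamma):
    """First derivative of the CRRA utility function: c^(-gamma)."""
    return c ** (-gamma)


def relative_risk_aversion(c, gamma, h=1e-4):
    """-c U''(c) / U'(c), with U'' approximated by a central difference on U'."""
    u1 = crra_marginal_utility(c, gamma)
    u2 = (crra_marginal_utility(c + h, gamma)
          - crra_marginal_utility(c - h, gamma)) / (2.0 * h)
    return -c * u2 / u1


# Illustrative values: the measure does not depend on the consumption level.
for c in (0.5, 1.0, 2.0):
    print(c, round(relative_risk_aversion(c, gamma=3.0), 6))
```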

First order conditions for optimality can be stated as follows:

$$E_t \left( c_{t+i}^{-\gamma} \right) + E_t \lambda_{t+i} = 0 \quad \forall i \qquad (7.4)$$

$$E_t \lambda_{t+i} - E_t \left( \frac{1+r}{1+\delta}\, \lambda_{t+i+1} \right) = 0 \quad \forall i$$

By eliminating the Lagrange multipliers and by considering the specific case of
$i = 0$, we obtain:

$$E_t \left( \frac{1+r}{1+\delta}\, c_{t+1}^{-\gamma} - c_t^{-\gamma} \right) = 0 \qquad (7.5)$$

Some considerations on equation (7.5), known as the Euler equation, are in
order. First, this relation clearly refutes the idea that economic theory gives
mainly predictions on the long-run behaviour of economic variables: in fact,
the Euler equation imposes restrictions on the short-run dynamics of economic
variables. Second, the only parameters entering equation (7.5) are $\gamma$ and
$\delta$, the “deep” parameters describing the consumer's preferences. Third, as
equation (7.5) does not represent the “closed form solution” of the intertemporal
optimisation problem but just the first order condition for optimality, it cannot
be interpreted as a consumption function. However, from (7.5) we derive the
falsifiable proposition that, under the joint hypothesis of intertemporal
optimisation and Rational Expectations, the only significant
variable in predicting consumption at time $t+1$ given the information available
at time $t$ is consumption at time $t$. Under our hypotheses, the logarithm of
consumption behaves as a “random walk” (Hall [17]). The conditional expectation
for time $t+1$ taken at time $t$ of the expression between brackets in (7.5) is
zero, and such expression is orthogonal to any variable other than consumption
included in the agent's information set at time $t$. Labelling $f_{t+1}$ the
expression between brackets in (7.5), we have:

$$E_t f_{t+1} = 0, \qquad E_t f_{t+1} z_t = 0 \qquad (7.6)$$

where $z$ is a vector containing any economic variable observable at time $t$.
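The orthogonality conditions in (7.6) have a direct sample counterpart, which the following sketch computes. It is a minimal illustration under stated assumptions: the function names are ours, the real return is treated as known, and the consumption series in the usage lines is a hypothetical deterministic path that satisfies the Euler equation exactly, so that the sample moments vanish at the true parameters.

```python
import numpy as np


def euler_residuals(c, r, delta, gamma):
    """f_{t+1} = (1+r)/(1+delta) * c_{t+1}^(-gamma) - c_t^(-gamma)."""
    c = np.asarray(c, dtype=float)
    return (1 + r) / (1 + delta) * c[1:] ** (-gamma) - c[:-1] ** (-gamma)


def sample_moments(c, z, r, delta, gamma):
    """Sample analogue of E[f_{t+1} z_t]; z has one row per period."""
    f = euler_residuals(c, r, delta, gamma)
    z = np.asarray(z, dtype=float)[:-1]   # z_t is paired with f_{t+1}
    return z.T @ f / len(f)


# Hypothetical example: consumption grows at the rate implied by the Euler equation.
r, delta, gamma = 0.04, 0.02, 2.0
growth = ((1 + r) / (1 + delta)) ** (1 / gamma)
c = growth ** np.arange(10)
z = np.column_stack([np.ones(10), np.arange(10)])  # a constant and a trend

print(sample_moments(c, z, r, delta, gamma))       # numerically zero
print(sample_moments(c, z, r, delta, gamma=3.0))   # non-zero at a wrong gamma
```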
Note that the Euler equation does not have any implication for the relation
between consumption and other economic variables: the significance of income in
explaining contemporaneous fluctuations in consumption is perfectly compatible
with our intertemporal optimization model, which only rules out the significance
of income at time $t$ in explaining the difference between the marginal utility of
savings at time $t$ for time $t+1$ and the marginal utility of consumption at
time $t$. (If the consumer does not consume one unit at time $t$, she will invest
it in the financial asset and have $(1+r)$ units at time $t+1$, the discounted
value of this quantity being $\frac{1+r}{1+\delta}$.) Finally, note that if the rate
of intertemporal preference and the interest rate are equal, fluctuations in
consumption are determined exclusively by stochastic shocks.

To further illustrate the relationship between the Euler
equation and the consumption function, and to provide a firmer background to our
discussion of econometric methodology, we take advantage of the simplicity of
our specification to derive analytically a closed form solution to our
intertemporal optimisation problem. To simplify matters even further, consider
“certainty equivalence”.

From the first order conditions we derive the following relationship between
consumption at time $t$ and consumption in any period following $t$:

$$c_{t+i} = c_t \left( \frac{1+r}{1+\delta} \right)^{i/\gamma} \qquad (7.7)$$

Aggregate now over time the period budget constraint and impose the
transversality condition to obtain:

$$\sum_{i=0}^{\infty} \frac{c_{t+i}}{(1+r)^i} = \sum_{i=0}^{\infty} \frac{y_{t+i}}{(1+r)^i} + (1+r) A_{t-1}. \qquad (7.8)$$

By using (7.7) in (7.8) to substitute consumption in all future periods with
an expression in terms of current consumption, we obtain:

$$c_t = (\rho - 1) \left( \sum_{i=0}^{\infty} \frac{y_{t+i}}{(1+r)^i} + (1+r) A_{t-1} \right) \qquad (7.9)$$

where $\rho = \frac{(1+\delta)^{1/\gamma}}{(1+r)^{(1-\gamma)/\gamma}}$ is assumed to be
greater than one. When $\delta = r$, expression (7.9) simplifies drastically to the
following:

$$c_t = r \left( \sum_{i=0}^{\infty} \frac{y_{t+i}}{(1+r)^i} + (1+r) A_{t-1} \right). \qquad (7.10)$$

Equation (7.10) represents the closed form solution to the intertemporal
optimisation problem and it is the structural consumption function for our
representative consumer under the hypothesis of certainty equivalence. Note that
consumption is a function of permanent income, which includes current income,
and that the reaction of consumption to changes in the real interest
rate depends on an income effect and on a substitution effect. The sign of the
income effect depends on the consumer's financial position: if the consumer is
in debt ($A_{t-1} < 0$) the income effect is negative, while for a consumer in credit
($A_{t-1} > 0$) the income effect is positive. The substitution effect is always
negative, as an increase in the interest rate lowers the discounted stream of future
income.
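The consumption profile implied by the first order conditions under certainty, equation (7.7), is easy to tabulate. The sketch below is limited to that relationship: it generates the path for hypothetical parameter values (the function name and the numbers are ours) and shows that consumption is flat when the rate of time preference equals the interest rate, and rising when the interest rate is higher.

```python
def consumption_path(c0, r, delta, gamma, periods):
    """Consumption implied by (7.7): c_{t+i} = c_t * ((1+r)/(1+delta))^(i/gamma)."""
    growth = ((1 + r) / (1 + delta)) ** (1 / gamma)
    return [c0 * growth ** i for i in range(periods + 1)]


# Hypothetical parameter values, for illustration only.
flat = consumption_path(c0=1.0, r=0.03, delta=0.03, gamma=2.0, periods=5)
rising = consumption_path(c0=1.0, r=0.05, delta=0.03, gamma=2.0, periods=5)

print([round(c, 4) for c in flat])    # delta = r: constant consumption
print([round(c, 4) for c in rising])  # r > delta: consumption grows over time
```

A larger value of the curvature parameter flattens the profile for any given gap between the interest rate and the rate of time preference.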

The closed form solution is useful to understand the skepticism of new-classical
economists towards the use of empirical ad-hoc structural macroeconometric
models to simulate the effect of macroeconomic policy. To see the point quickly
(see footnote 1), let us re-write relation (7.9) omitting “perfect foresight”:

$$c_t = (\rho - 1) \left( \sum_{i=0}^{\infty} \frac{E_t y_{t+i}}{(1+r)^i} + (1+r) A_{t-1} \right) + \varepsilon_t \qquad (7.11)$$

$$\varepsilon_t = \sum_{i=0}^{\infty} \frac{y_{t+i} - E_t y_{t+i}}{(1+r)^i}$$

In order to interpret (7.11) in the light of traditional ad-hoc macroeconometric
models, which do not usually incorporate expectations explicitly, we need to
solve for future income in terms of current income. We do so by assuming a
simple autoregressive process for income:

$$y_t = \alpha_1 y_{t-1} + u_t \qquad (7.12)$$

Using (7.12) repeatedly in (7.11) we have:

$$c_t = (\rho - 1)(1+r) A_{t-1} + (\rho - 1) \frac{1+r}{1+r-\alpha_1}\, y_t + \varepsilon_t. \qquad (7.13)$$
Parameters in (7.13) are convolutions of the deep parameters contained in $\rho$
and of the expectational parameter $\alpha_1$, which will change every time the
process generating income is subject to modification. Moreover, given the estimation
of (7.13), the structural parameters describing the consumer's tastes, $\delta$ and
$\gamma$, are not even identifiable. Note also that the residual term in (7.13) is,
by construction, autocorrelated. If we can represent autocorrelation in the
following simple manner

$$\varepsilon_t = \delta \varepsilon_{t-1} + v_t \qquad (7.14)$$

then the best representation of the Data Generating Process will be the
following:

$$\Delta c_t = (\rho - 1)(1+r) \Delta A_{t-1} + (\rho - 1) \frac{1+r}{1+r-\alpha_1} \Delta y_t - (1 - \delta) \left( c_{t-1} - (\rho - 1)(1+r) A_{t-2} - (\rho - 1) \frac{1+r}{1+r-\alpha_1}\, y_{t-1} \right) + v_t \qquad (7.15)$$

which, obviously, is an Error Correction Mechanism (ECM).

1 This amounts to a little cheating, which simplifies matters greatly without having any
substantial effect on our final conclusions.
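The step from a static relation with a first-order autocorrelated residual to an ECM is purely algebraic, and it can be checked numerically. In the sketch below a single regressor x stands in for the wealth and income terms of (7.13), k is a hypothetical long-run coefficient, and phi is our name for the autocorrelation coefficient of the residual; none of these numbers comes from the text.

```python
import numpy as np


def ecm_discrepancy(k=0.8, phi=0.5, T=200, seed=0):
    """Largest gap between delta c_t and its ECM representation."""
    rng = np.random.default_rng(seed)
    x = np.cumsum(rng.normal(size=T))       # a generic trending regressor
    v = rng.normal(size=T)
    eps = np.zeros(T)
    for t in range(1, T):
        eps[t] = phi * eps[t - 1] + v[t]    # AR(1) residual
    c = k * x + eps                         # static relation
    lhs = np.diff(c)                        # delta c_t for t = 1, ..., T-1
    # ECM: short-run term, error correction term, white-noise innovation.
    rhs = k * np.diff(x) - (1 - phi) * (c[:-1] - k * x[:-1]) + v[1:]
    return float(np.max(np.abs(lhs - rhs)))


print(ecm_discrepancy())  # zero up to rounding error
```

The two representations coincide for any value of the coefficients, which is why a specification search within the ECM class can reach the same equation without reference to the underlying theory.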

It is clear that when the income generating process is constant a specification
like (7.15) will perform extremely well in fitting the data. Note that such a
specification can be obtained without any reference to the theoretical
intertemporal optimisation approach, being derived by an LSE-type econometric
specification search within the class of ECM representations of cointegrating
regressions. However, if the Data Generating Process is the one postulated by the
intertemporal optimization theory, then the estimated model cannot be used for
policy simulation. No empirical question involving the simulation of the impact on
consumption of different policies determining the income process can be
meaningfully answered on the basis of the estimation of a model like (7.15). In
fact, the estimated parameters are functions of the parameters in the income
process, and they become misleading if the policy to be simulated implies a change
in the income generating process. Within the intertemporal optimisation framework
the answer to interesting policy questions has to be based on the theoretical model
rather than on an empirical ad-hoc macroeconometric model. Therefore the
evaluation of the impact of different policies on consumption is to be based on the
direct simulation of alternative processes for income within the framework of the
theoretical model (7.11). Obviously, to implement this approach meaningfully, some
estimate of the parameter $\rho$, and hence of the parameters $\delta$ and $\gamma$
which describe the preferences of consumers, is needed. Now, the Euler equation
qualifies immediately as the best relation to be estimated empirically for the
identification of the parameters of interest. In fact, it allows identification of
the parameters of interest and it does not depend at all on the expectational
parameters. Moreover, it lends itself naturally to the implementation of an
estimator: the Generalised Method of Moments. We devote the next section to the
econometric analysis of this estimator.
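The dependence of the estimated coefficients on the income process can be made concrete by looking at the factor that multiplies current income in (7.13), namely (1+r)/(1+r-alpha_1), which is the discounted sum of the expected income path generated by (7.12). The sketch below checks the closed form against a truncated sum and shows how the factor moves when the persistence of income changes. Function names and numerical values are hypothetical.

```python
def income_multiplier(r, alpha1):
    """Closed form of sum over i >= 0 of (alpha1/(1+r))^i; needs |alpha1| < 1 + r."""
    return (1 + r) / (1 + r - alpha1)


def income_multiplier_truncated(r, alpha1, n_terms=2000):
    """The same discounted sum, truncated after n_terms terms."""
    return sum((alpha1 / (1 + r)) ** i for i in range(n_terms))


r = 0.03
for alpha1 in (0.5, 0.9):
    print(alpha1,
          round(income_multiplier(r, alpha1), 4),
          round(income_multiplier_truncated(r, alpha1), 4))
```

A policy that changes the persistence of income from 0.5 to 0.9 roughly quadruples the factor, so a coefficient estimated under the first regime would be a poor guide to the response of consumption under the second.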

7.3 Estimating Euler equations: The GMM method. 

Generalizing the results of the specific problem discussed in the previous section,
we can represent the first order condition from a generic intertemporal optimisation
problem as follows:

$$E_t \left[ f(x_{t+i}, \theta)\, z_t \right] = 0 \qquad (7.16)$$

where $\theta$ is the $(p \times 1)$ vector containing the parameters of interest and
$z$ is the $(n \times 1)$ vector of variables that theory suggests are orthogonal to
$f(x_{t+i}, \theta)$. In our example $\theta = (\delta, \gamma)$,
$f(x_{t+i}, \theta) = \frac{1+r}{1+\delta}\, c_{t+1}^{-\gamma} - c_t^{-\gamma}$, and $z_t$
contains any variables observable at time $t$ other than consumption.

It is intuitively clear that a necessary condition for identification of the
parameters of interest is $n \geq p$, with over-identification in the case of strict
inequality and just-identification in the case of equality. If $n < p$, then the
parameters of interest are not identified. Let us concentrate on the
over-identification case, of which just-identification is a special case. This is
going to be the relevant case in many economic examples, as deriving Euler equations
from intertemporal optimisation and Rational Expectations usually selects a
potentially infinite number of valid instruments. Think of our application to the
consumer problem: any lagged variable is a valid instrument under the null that the
rational-expectations/intertemporal optimisation model is the Data Generating Process.

The estimator is “naturally” derived from (7.16) by substituting population
moments with sample moments:

$$\frac{1}{T} \sum_{t=1}^{T} \left[ f(x_{t+i}, \theta)\, z_t \right] = 0 \qquad (7.17)$$

where $T$ is the size of the available sample. Obviously, in the case of
over-identification (7.17) is a system of $n$ equations in $p$ unknowns, which in
general does not admit a solution. This problem is solved by considering $p$ linear
combinations of the $n$ first order conditions and therefore by minimising the
“Euclidean distance” between $\sum_{t=1}^{T} \left[ f(x_{t+i}, \theta)\, z_t \right]$
and the null vector. This implies solving the following minimisation problem:

$$\min_{\theta} \left[ \sum_{t=1}^{T} f(x_{t+i}, \theta)\, z_t \right]' A \left[ \sum_{t=1}^{T} f(x_{t+i}, \theta)\, z_t \right] \qquad (7.18)$$

where $A$ is an appropriate $(n \times n)$ weighting matrix. By defining a
$(T \times n)$ matrix $F(x_{t+i}, z_t, \theta)$, with typical element
$f(x_{t+i}, \theta)\, z_{jt}$, where $j = 1, \dots, n$, $t = 1, \dots, T$, the
minimisation problem can be re-written as:

$$\min_{\theta} \; \iota' F(x_{t+i}, z_t, \theta)\, A\, F(x_{t+i}, z_t, \theta)'\, \iota$$

where $\iota$ is a $(T \times 1)$ vector of ones. It can be shown that any symmetric
positive definite matrix $A$ will yield a consistent estimate of the vector of
parameters of interest. However, Hansen [18] has shown that a necessary (but not
sufficient) condition to obtain an asymptotically efficient estimate of $\theta$ is to
set $A$ equal to the inverse of the covariance matrix of the sample moments. The
intuition behind this choice is simple: less weight is put on the more imprecise
conditions.
Therefore, if $\Psi = \mathrm{Var} \left[ \sum_{t=1}^{T} f(x_{t+i}, \theta)\, z_t \right]$,
GMM estimates are obtained by solving the following minimisation problem:

$$\min_{\theta} \left[ \sum_{t=1}^{T} f(x_{t+i}, \theta)\, z_t \right]' \Psi^{-1} \left[ \sum_{t=1}^{T} f(x_{t+i}, \theta)\, z_t \right] \qquad (7.19)$$

Note that in general, as $\Psi$ is a function of $\theta$, it is necessary to proceed
in at least two steps. Exploiting the fact that any arbitrary weighting matrix
delivers consistent estimates, $\theta$ is estimated first; then $\Psi$ is
constructed, and the minimization of (7.19) is performed. Of
course, the two-step procedure is easily extended to an iterative procedure.

Hansen [18] has shown that the minimised criterion function can also be used to
test the validity of instruments in case of over-identification. In fact the
quantity:

$$J = \left[\sum_{t=1}^{T} f\left(x_{t+i},\hat\theta\right)z_t\right]' \hat\Psi^{-1} \left[\sum_{t=1}^{T} f\left(x_{t+i},\hat\theta\right)z_t\right] \tag{7.20}$$

is asymptotically distributed as a $\chi^2$ with n - p degrees of freedom. The
quantity (7.20) is known in the literature as the J statistic. GMM is a very
general class of estimators and many of the known estimators can be set up as
special cases of GMM. Consider for example the Generalised Instrumental
Variables Estimator (GIVE).

The relevant problem is to estimate the vector of unknown parameters $\beta$ in
the linear model:

$$y = X\beta + u \tag{7.21}$$

where y is a (T × 1) vector of observations on the dependent variable, X is a
(T × p) matrix of observations on the explanatory variables, $\beta$ is the
(p × 1) vector of parameters of interest, and u is the (T × 1) vector of
observations on the error term, with zero mean and variance-covariance matrix
equal to $\sigma^2 I$. Assume that the variables in X are not weakly exogenous
for the estimation of the parameters of interest; we then have:

$$\operatorname{plim}\frac{1}{T}X'u \neq 0 \tag{7.22}$$

However, there exists a matrix Z containing T observations on n valid
instruments, for which we have:

$$\operatorname{plim}\frac{1}{T}Z'u = 0 \tag{7.23}$$

Condition (7.23), which defines instruments as valid, also gives a set of
orthogonality restrictions from which to construct a GMM estimate. Let us
concentrate on the over-identification case, where n > p. Applying formula
(7.19), the relevant estimate is derived by solving the following problem:

$$\min_{\beta}\; u'Z\,\Psi^{-1}Z'u \tag{7.24}$$

where the appropriate choice for the matrix $\Psi$ is:

$$\Psi = E\left(Z'uu'Z\right) = \sigma^2 Z'Z \tag{7.25}$$


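Therefore, the GMM estimate will minimise the following criterion:

$$\min_{\beta}\;\frac{1}{\sigma^2}\,u'Z\left(Z'Z\right)^{-1}Z'u \tag{7.26}$$

which admits GIVE as the solution:

$$\hat\beta = \left(X'Z\left(Z'Z\right)^{-1}Z'X\right)^{-1}X'Z\left(Z'Z\right)^{-1}Z'y \tag{7.27}$$

Similarly, the J-statistic will take the following form:

$$J = \frac{\hat u'Z\left(Z'Z\right)^{-1}Z'\hat u}{s^2} \tag{7.28}$$

where $\hat u = y - X\hat\beta$ and $s^2$ is the estimate of $\sigma^2$ computed
from the residuals $\hat u$. The statistic (7.28) is asymptotically distributed
as a $\chi^2$ with n - p degrees of freedom, and it is the well-known test for
the validity of instruments originally proposed by Sargan [29] within the
context of the GIVE estimator.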
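
Formulas (7.27) and (7.28) can be sketched numerically as follows. This is only
an illustration on simulated data: the function name `give`, the data
generating process and the choice of dividing the residual sum of squares by T
to obtain $s^2$ are assumptions made here, not part of the text.

```python
import numpy as np

def give(y, X, Z):
    """GIVE estimate (7.27) and Sargan statistic (7.28).

    Assumption: s^2 is computed as u'u / T.
    """
    T = len(y)
    ZtZ_inv = np.linalg.inv(Z.T @ Z)
    XtPz = X.T @ Z @ ZtZ_inv @ Z.T            # X'Z (Z'Z)^{-1} Z'
    beta = np.linalg.solve(XtPz @ X, XtPz @ y)
    u = y - X @ beta
    s2 = u @ u / T
    Zu = Z.T @ u
    J = Zu @ ZtZ_inv @ Zu / s2                # compare with chi2(n - p)
    return beta, J

# Simulated example: one endogenous regressor, n = 4 instruments, p = 2.
rng = np.random.default_rng(0)
T = 500
Z = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])
e = rng.normal(size=T)
x = Z[:, 1:] @ np.array([1.0, 0.5, -0.5]) + 0.8 * e + rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([1.0, 2.0]) + e

beta_hat, J = give(y, X, Z)
```

When Z has exactly as many columns as X and coincides with it, the same function
returns the OLS estimate and a J statistic of zero, which is a convenient sanity
check on the algebra.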

7.3.1 Covariance Matrix Estimation 

So far we have implicitly considered the case in which the empirical moments
are serially independent. In general it is worthwhile to relax this assumption,
as in many applied macroeconometric cases some dependence in the empirical
moments will be generated. Think for example of the estimation of Central Bank
reaction functions. As we shall see later, a Central Bank's policy rule can be
specified by assuming that the Central Bank sets its instrument, the interest
rate, to react to the contemporaneous output gap (the difference between current
and potential output) and to the deviation of future expected inflation from the
inflation target. Future inflation is the relevant variable because the
existence of lags between monetary actions and their effects on the economy
makes reacting to contemporaneous deviations from target useless. The literature
takes the relevant horizon for future inflation to be about one year. So the
following rule could be specified on monthly data:

$$r_t = a_0 + a_1E_t\left(\pi_{t+12}-\pi^*_t\right) + a_2E_t\left(y_t-y^*_t\right) + v_t \tag{7.29}$$

where $v_t$ is an exogenous i.i.d. disturbance. To fit (7.29) to the data, the
unobservable forecast variables are eliminated by rewriting the rule in terms of
realized variables as follows:

$$r_t = a_0 + a_1\left(\pi_{t+12}-\pi^*_t\right) + a_2\left(y_t-y^*_t\right) + \epsilon_t \tag{7.30}$$

$$\epsilon_t = a_1\left[E_t\left(\pi_{t+12}-\pi^*_t\right)-\left(\pi_{t+12}-\pi^*_t\right)\right] + a_2\left[E_t\left(y_t-y^*_t\right)-\left(y_t-y^*_t\right)\right] + v_t \tag{7.31}$$

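Labelling $z_t$ the vector of variables within the Central Bank's information
set at the time the interest rate is chosen, we can construct the GMM estimator
using the following set of orthogonality conditions:

$$E_t\left(\epsilon_t \mid z_t\right) = 0 \tag{7.32}$$

However, by construction, the composite disturbance term $\epsilon_t$ features
an MA(11) structure: the forecast error on $\pi_{t+12}$ depends on the news
arriving between t+1 and t+12, so that disturbances dated less than twelve
months apart share some of the same shocks. The empirical moments therefore
cannot be considered as serially independent.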
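
The overlap argument can be made concrete with a small sketch. It assumes,
purely for illustration, that the twelve-month-ahead forecast error is an
unweighted sum of twelve i.i.d. monthly shocks; this simplification and the
function name `overlap_autocov` are not taken from the text.

```python
def overlap_autocov(j, horizon=12, sigma2=1.0):
    """Autocovariance at lag j of e_t = eta_{t+1} + ... + eta_{t+horizon}.

    Assumption: the eta's are i.i.d. with variance sigma2, so the
    autocovariance is sigma2 times the number of shocks that e_t and
    e_{t-j} have in common.
    """
    shared = max(horizon - abs(j), 0)
    return shared * sigma2

# Lags 0, 1, ..., 13: non-zero up to lag 11, exactly zero from lag 12 onwards.
autocovs = [overlap_autocov(j) for j in range(14)]
```

Under this assumption the autocovariances decline linearly and vanish after
lag 11, which is the pattern of an MA(11) process.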

To deal with this case we rewrite $\Psi$, the covariance matrix of the empirical
moments, as follows:

$$\Psi = \lim_{T\to\infty}\frac{1}{T}\sum_{p=1}^{T}\sum_{q=1}^{T}E\left(F_p'\left(\theta\right)F_q\left(\theta\right)\right) \tag{7.33}$$

where $F_q(\theta)$, short for $F_q(x_{t+i},z_t,\theta)$, is the q-th row of the
(T × n) matrix $F(x_{t+i},z_t,\theta)$. The first step to find a consistent
estimator of $\Psi$ is to define the autocovariances of the empirical moments as
follows:

$$\Gamma(j) = \frac{1}{T}\sum_{p=j+1}^{T}E\left(F_p'\left(\theta\right)F_{p-j}\left(\theta\right)\right) \quad\text{for } j \geq 0 \tag{7.34}$$

$$\Gamma(j) = \frac{1}{T}\sum_{p=-j+1}^{T}E\left(F_{p+j}'\left(\theta\right)F_p\left(\theta\right)\right) \quad\text{for } j < 0 \tag{7.35}$$

In terms of the (n × n) matrices $\Gamma(j)$, the right hand side of (7.33)
without the limit becomes:

$$\Psi^{T} = \sum_{j=-T+1}^{T-1}\Gamma(j). \tag{7.36}$$

If there were no serial correlation between observations, then only $\Gamma(0)$
would be non-zero and we would have

$$\Psi^{T} = \Gamma(0) = \frac{1}{T}\sum_{p=1}^{T}E\left(F_p'\left(\theta\right)F_p\left(\theta\right)\right) \tag{7.37}$$

which could be useful to deal with heteroscedasticity in the empirical moments.
To see this point empirically let us consider again the case of GIVE, this time
with heteroscedastic disturbances.

The relevant problem is to estimate the vector of unknown parameters $\beta$ in
the linear model:

$$y = X\beta + u \tag{7.38}$$

where y is a (T × 1) vector of observations on the dependent variable, X is a
(T × p) matrix of observations on the explanatory variables, $\beta$ is the
(p × 1) vector of parameters of interest, and u is the (T × 1) vector of
observations on the error term, with zero mean and variance-covariance matrix
equal to G, where

$$G = \begin{bmatrix} \sigma_1^2 & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & \sigma_T^2 \end{bmatrix}.$$

As before we assume that the variables in X are not weakly exogenous for the
estimation of the parameters of interest, but there exists a matrix Z containing
T observations on n valid instruments. Applying formula (7.19), the relevant
estimate is derived by solving the following problem:

$$\min_{\beta}\; u'Z\,\Psi^{-1}Z'u \tag{7.39}$$

where the appropriate choice for the matrix $\Psi$ is:

$$\Psi = \Gamma(0) = \frac{1}{T}Z'GZ = \frac{1}{T}\sum_{p=1}^{T}E\left(u_p^2\right)Z_p'Z_p \tag{7.40}$$

with $Z_p$ denoting the p-th row of Z. This matrix can be consistently estimated
by using any consistent estimator of the parameters of interest and by
substituting $E(u_p^2)$ with just the square of the corresponding residual.

This estimator is interpretable as an extension to the GIVE case of the
Heteroscedasticity Consistent estimator proposed by White within the OLS
framework.

Having discussed this application let us go back to the general case of serial
correlation of unknown form. As remarked by Davidson-MacKinnon [7], it is
tempting to estimate the autocovariances simply by omitting the expectations in
(7.34)-(7.35), evaluating the $F_p$ at some consistent preliminary estimate of
the vector of parameters of interest, and then substituting the
$\hat\Gamma(j)$ so obtained into (7.36) in order to obtain a suitable estimate
of $\Psi$. This would just generalise the procedure implemented in the case
$\Psi = \Gamma(0)$. However, while the sample autocovariance matrix of order
zero, evaluated at a consistent estimate of $\theta$ without the expectation, is
a consistent estimate of the true autocovariance matrix of order zero
$\Gamma(0)$, the sample autocovariance of order j evaluated in the same manner
is not consistent for the true autocovariance matrix of order j when j is
allowed to grow with T. To see this, think of the case j = T - 1, when
$\hat\Gamma(j)$ has only one term. No law of large numbers can conceivably apply
to a single term, and $\hat\Gamma(j)$ will tend to zero as T goes to infinity on
account of the factor 1/T in the definition. An empirical way out of the problem
is to limit our attention to models in which the true autocovariance of order j
tends to zero as j goes to infinity. It then seems reasonable to truncate the
sum by eliminating terms for which |j| is greater than some threshold p. We have
then

$$\hat\Psi = \hat\Gamma(0) + \sum_{j=1}^{p}\left[\hat\Gamma(j) + \hat\Gamma'(j)\right]. \tag{7.41}$$

The choice of p, the lag truncation parameter, is often not a difficult issue.
In many cases the appropriate lag truncation parameter is suggested by the model
which leads to the specification of the orthogonality conditions. In our above
example of the policy rule, the obvious choice is p = 11. In case the choice of
the lag truncation parameter is not suggested by the economic problem at hand,
statistical criteria, related to the number of observations in the sample and to
the length of the memory of the data (i.e. the degree of autocorrelation in the
sample moments), can be used.² There is however a more serious difficulty
associated with (7.41) than the choice of the lag truncation parameter: the
resulting estimate need not be positive definite. Newey and West [24] have
proposed a solution to this problem: multiply the sequence of the
$\hat\Gamma(j)$'s by a sequence of weights that decreases as |j| increases.
Specifically, they propose the following estimator:

$$\hat\Psi = \hat\Gamma(0) + \sum_{j=1}^{p}\left(1-\frac{j}{p+1}\right)\left[\hat\Gamma(j) + \hat\Gamma'(j)\right]. \tag{7.42}$$

For a complete discussion of the properties of this estimator and for some
alternative solutions to the positive definiteness problem see Andrews [1].

² See for example EViews 3 User's Guide, Chapter 18, pp. 488-492.
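
The positive definiteness problem is easy to reproduce in the scalar case
(n = 1), where $\hat\Gamma(j)+\hat\Gamma'(j)$ reduces to twice the sample
autocovariance. The sketch below compares (7.41) and (7.42) on a hypothetical,
strongly negatively autocorrelated moment series; the series and the function
names are illustrative assumptions.

```python
def sample_autocov(m, j):
    """Sample analogue of (7.34) for a scalar series: (1/T) sum m_t * m_{t-j}."""
    T = len(m)
    return sum(m[t] * m[t - j] for t in range(j, T)) / T

def truncated(m, p):
    """Unweighted truncated estimator (7.41), scalar case."""
    return sample_autocov(m, 0) + sum(
        2.0 * sample_autocov(m, j) for j in range(1, p + 1))

def newey_west(m, p):
    """Bartlett-weighted estimator (7.42), scalar case."""
    return sample_autocov(m, 0) + sum(
        (1.0 - j / (p + 1)) * 2.0 * sample_autocov(m, j)
        for j in range(1, p + 1))

# Hypothetical moment series alternating in sign (T = 20).
m = [1.0, -1.0] * 10
v_truncated = truncated(m, 1)     # a negative "variance"
v_newey_west = newey_west(m, 1)   # remains positive
```

With this series the unweighted estimate is negative, which is not admissible
for a variance, while the Bartlett weights keep the estimate positive.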

Consider once more our GIVE example, in which the covariance matrix of the
disturbances now takes a very general form: it is full and allows for
heteroscedasticity and autocorrelation of an unknown form. The GMM estimator
becomes now:

$$\hat\beta = \left(X'Z\,\hat\Psi^{-1}Z'X\right)^{-1}X'Z\,\hat\Psi^{-1}Z'y \tag{7.43}$$

which is also known as two-step two-stage least squares. In fact, after a
preliminary consistent estimate of the parameters of interest is derived by an
ordinary IV procedure, $\hat\Psi$ is constructed, having chosen the lag
truncation parameter and applying the Newey-West correction. The estimate
$\hat\Psi$ allows the construction of a more efficient estimate of the vector of
parameters of interest via formula (7.43). The estimator proposed by Cumby,
Huizinga and Obstfeld [6], derived in the framework of rational expectations
models, can be considered as a special case of the above estimator.

Lastly note that, as far as inference is concerned, the J-statistic can still be
constructed as in the simple case of non-autocorrelated and homoscedastic sample
moments, using the appropriate estimator of the variance-covariance matrix of
the sample moments.
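
A minimal sketch of the two-step procedure behind (7.43), on simulated data with
an MA(1) disturbance, is given below. The function names `hac` and
`two_step_2sls`, the simulated design and the choice p = 1 are assumptions made
for illustration; the weighting matrix is the Bartlett-weighted estimate (7.42)
built from the moments $u_t z_t$.

```python
import numpy as np

def hac(m, p):
    """Newey-West estimate (7.42) from a (T x n) array whose rows are u_t * z_t."""
    T = m.shape[0]
    psi = m.T @ m / T                            # Gamma_hat(0)
    for j in range(1, p + 1):
        gamma_j = m[j:].T @ m[:-j] / T           # Gamma_hat(j)
        psi += (1.0 - j / (p + 1)) * (gamma_j + gamma_j.T)
    return psi

def two_step_2sls(y, X, Z, p):
    """Step 1: ordinary IV. Step 2: re-estimate with weight inv(Psi_hat)."""
    W = np.linalg.inv(Z.T @ Z)
    A = X.T @ Z @ W @ Z.T
    b1 = np.linalg.solve(A @ X, A @ y)
    u1 = y - X @ b1
    psi_inv = np.linalg.inv(hac(Z * u1[:, None], p))
    B = X.T @ Z @ psi_inv @ Z.T
    b2 = np.linalg.solve(B @ X, B @ y)           # formula (7.43)
    return b1, b2

# Simulated example (hypothetical design): MA(1) errors, 4 instruments.
rng = np.random.default_rng(1)
T = 400
Z = np.column_stack([np.ones(T), rng.normal(size=(T, 3))])
e = rng.normal(size=T + 1)
u = e[1:] + 0.5 * e[:-1]
x = Z[:, 1:].sum(axis=1) + 0.7 * u + rng.normal(size=T)
X = np.column_stack([np.ones(T), x])
y = X @ np.array([0.5, 1.5]) + u

b_iv, b_gmm = two_step_2sls(y, X, Z, p=1)
```

In the exactly identified case the weighting matrix is irrelevant and the two
steps return the same estimate, which gives a simple check of the code.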

7.4 The limits to the Euler Equation-GMM approach 

In this section we shall consider the limits of the Euler Equation-GMM approach
both from a theoretical and an applied point of view. The analysis of the
theoretical problems is aimed at showing the great difficulties in implementing
GMM when market imperfections are brought into the intertemporal optimisation
approach. The empirical problems are related to the nature of the “deep”
parameters estimated by GMM. Such parameters should describe taste and
technology and, by their nature, should therefore be constant over different
sample periods. They do not seem to be constant.

7.4.1 Theoretical problems 

To show the difficulties in implementing GMM estimation of Euler equations
derived outside the pure New-classical paradigm, take the Mankiw et al. [23]
quotation reported at the beginning of this chapter literally and introduce
liquidity constraints in the intertemporal optimisation problem for the
representative consumer. We have:

$$\max_{c_{t+i},A_{t+i}}\; E_t\sum_{i=0}^{\infty}\left(1+\delta\right)^{-i}u\left(c_{t+i}\right) \tag{7.44}$$

subject to the following constraints

$$A_{t+i} = \left(1+r\right)A_{t+i-1} + y_{t+i} - c_{t+i}$$

$$\lim_{i\to\infty}E_tA_{t+i}\left(1+r\right)^{-i} = 0$$

$$A_{t+i} \geq b_{t+i}$$

The only difference with the problem originally considered is that we have
introduced a finite, although not necessarily positive, limit b to the stock of
debt available to the consumer in each period.

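The liquidity constraint could be re-written as follows:

$$H_{t+i} = \left(1+r\right)^{i+1}A_{t-1} + \sum_{j=0}^{i}\left(1+r\right)^{i-j}\left(y_{t+j}-c_{t+j}\right) - b_{t+i} \geq 0 \tag{7.45}$$

and the intertemporal optimisation problem becomes now the following:

$$\max_{c_{t+i},A_{t+i}}\; E_t\sum_{i=0}^{\infty}\left(1+\delta\right)^{-i}G_{t+i} \tag{7.46}$$

$$G_{t+i} = u\left(c_{t+i}\right) + \lambda_{t+i}\left(A_{t+i} - \left(1+r\right)A_{t+i-1} - y_{t+i} + c_{t+i}\right) + \mu_{t+i}H_{t+i}$$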

where $\lambda$ is the Lagrange multiplier associated with the stock-flow
relationship between wealth and savings and $\mu$ is the Lagrange multiplier
associated with the existence of liquidity constraints. When the liquidity
constraint does not bind and $\mu = 0$, we are back to the original
intertemporal optimisation problem. Maintaining the CRRA parameterisation for
the utility function, the first order conditions for optimisation are now:


' -| . C \ 

Et (c t ^A) — EtXt.+i. = Et ( -- ) Ut^; Vi 


+ A t +i — Hit 


1 + r 


yt+i 


(7.47) 


EtXt+i — E t ^ i ^ ^ Xt+i+i j — 0 Vi 

Setting i = 0 and combining the first order conditions we derive the Euler
Equation as :
Et ^ i + g c t+i c t 7 ^ + r ^ Pt+i + Vtj ~ (7.48)

Analysing equation (7.48) we note immediately that the specification of the
utility function and the assumption of rational expectations are no longer
sufficient to eliminate all the non-observable variables from the Euler
equation, which now contains the Lagrange multipliers associated with the
liquidity constraint. GMM is therefore no longer empirically feasible. This
simple example shows rather clearly why, when constraints are introduced in the
intertemporal optimisation problem, the GMM method becomes much less popular.

It is fair to say that several solutions have been proposed to the problem
generated by the introduction of liquidity constraints, but none of them
replicates the neat correspondence between the solution to the economic problem
and the implementation of the econometric methodology obtained in the case of
intertemporal optimisation without market imperfections.

Deaton [8]-[9] observes the impossibility of finding an analytical solution to
(7.48) and proposes to characterise the properties of the numerical solution
obtained under the hypothesis of a very simple DGP for the income process.
However, even for a very simple autoregressive process for income, the
computational burden is rather heavy. Pesaran-Smith [27] propose to approximate
the unknown Lagrange multipliers by a general function of observable variables.
Within this context, it is important to gather institutional information to help
the identification of the appropriate functional form and of the appropriate
arguments for such a function. Favero-Pesaran [11] apply this methodology to the
empirical modelling of oil investment, using institutional and geological
information to identify the appropriate function. Abandoning time series, there
is the possibility of resorting to panel data to identify liquidity constrained
agents (Zeldes [34]). Brave attempts to identify the relevance of liquidity
constraints using time series data have been proposed by Campbell-Mankiw [2] and
Jappelli-Pagano [21]. Aggregate time series consumption is thought of as the
result of the aggregation of consumption by two types of agents: those who are
liquidity constrained and those who are not. To allow aggregation, utility is
assumed to be quadratic; the behaviour of the unconstrained agents is then
described by the usual Euler equation, while constrained agents are assumed to
consume all their disposable income in each period. By assuming that a fixed
proportion of income accrues to each type of agent, the Euler equation for the
unconstrained agents and the consumption function for the constrained agents are
aggregated into a macroeconomic consumption function which, interestingly, takes
the form of an ECM model. One of the estimated parameters in such a consumption
function is the proportion of income accruing to the constrained agents. The
importance of liquidity constraints in the economy can therefore be empirically
evaluated on macro time-series data. It is not clear, however, why the
proportion of income accruing to the liquidity constrained agents is thought of
as a parameter rather than a variable.

7.4.2 Empirical Problems

The main empirical problem with the GMM approach to estimating structural
parameters has been noted by a series of authors³ for US data. It has been
observed that, in general, the parameters estimated on aggregate time-series
data by implementing GMM on Euler equations derived from different intertemporal
optimisation problems are not stable over time. Such instability is clearly not
compatible with their nature of parameters describing taste and technology
suggested by the theoretical models. There are several possible interpretations
of instability: it could signal the incorrect specification of the estimated
model, or it could be generated by the fact that representative agent models are
applied to aggregate data without taking proper care of aggregation. This second
interpretation has generated a research programme which refrains from estimating
the “deep” parameters from aggregate macroeconomic time-series. “Deep”
parameters are instead taken from microeconometric studies on disaggregated
data; using these parameters, theoretical models are then calibrated and
simulated; finally, properties of the simulated data are compared with
properties of macroeconomic time-series to evaluate the ability of the
theoretical model to replicate features of the real data. We shall concentrate
on the calibration methodology later on. We conclude now our analysis of the GMM
method by looking at empirical applications of this methodology.

7.5 An application to the consumer’s problem 

The first illustrative example which we consider involves the estimation of the
Euler equation (7.5):

$$E_t\left[\frac{1+r_{t+1}}{1+\delta}\,c_{t+1}^{-\gamma} - c_t^{-\gamma}\right] = 0,$$

using a data set of monthly US data, which is the version of the
Hansen-Singleton [20] data set made available as a tutorial data set for
Microfit version 4.0.⁴ The data set is available in Excel format as HS.XLS. It
contains monthly data for the sample 1959:3-1978:12 on the following variables:

X1: ratio of consumption in time period t - 1 to consumption in time period t;
X2: one plus the one-period real return on stocks.

Estimation of the Euler equation is implemented using the appropriate routine in
E-Views, using the Bartlett weights and the Newey-West criterion to choose the
lag truncation parameter. The results are reported in Table 1.
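
To make the objects entering the estimation explicit, the sketch below computes
the sample moments and a first-step criterion (identity weighting matrix) for
the expression C(1)*(X1^C(2))*X2-1 with instruments C, X1(-1) and X2(-1). The
function names and the toy data are assumptions made for illustration; the toy
series are constructed so that the Euler equation holds exactly at given
parameter values, and they are not the Hansen-Singleton data.

```python
def euler_residual(beta, gamma, x1, x2):
    """u_t = beta * X1_t**gamma * X2_t - 1."""
    return beta * (x1 ** gamma) * x2 - 1.0

def sample_moments(beta, gamma, X1, X2):
    """Sample analogue of E[u_t * z_t] with z_t = (1, X1_{t-1}, X2_{t-1})."""
    T = len(X1) - 1                       # one observation is lost to the lag
    g = [0.0, 0.0, 0.0]
    for t in range(1, len(X1)):
        u = euler_residual(beta, gamma, X1[t], X2[t])
        z = (1.0, X1[t - 1], X2[t - 1])
        for k in range(3):
            g[k] += u * z[k] / T
    return g

def criterion(beta, gamma, X1, X2):
    """Quadratic form g'Ag with A equal to the identity matrix."""
    return sum(gk * gk for gk in sample_moments(beta, gamma, X1, X2))

# Hypothetical toy data satisfying the Euler equation at beta=0.998, gamma=0.9.
X1 = [0.99, 1.01, 1.00, 0.98, 1.02, 1.00]
X2 = [1.0 / (0.998 * a ** 0.9) for a in X1]

q_true = criterion(0.998, 0.9, X1, X2)    # numerically zero
q_off = criterion(0.95, 0.9, X1, X2)      # strictly positive
```

A full estimation would minimise this criterion numerically over the two
parameters and then repeat the minimisation with the estimated weighting matrix.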

³ See for example Ghysels and Hall [14]-[15], Garber and King [13], Oliner,
Rudebusch and Sichel [25].

⁴ See Pesaran and Pesaran [28].

Table 1: C(1)*(X1^C(2))*X2-1

Coefficient    Estimate    Std. Error    t-Statistic    Prob.
C(1)           0.998082    0.004465      223.5548       0.0000
C(2)           0.891202    1.814987      0.491024       0.6239

S.E. of regression    0.041502      Sum squared resid     0.404766
J-statistic           0.006453      Durbin-Watson stat    1.828192

Instrument list: C X1(-1) X2(-1)

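Note that the parameters are estimated by using the following three
orthogonality conditions:

$$E\left[\frac{1+r_{t+1}}{1+\delta}\left(\frac{c_t}{c_{t+1}}\right)^{\gamma} - 1\right] = 0$$

$$E\left\{\left[\frac{1+r_{t+1}}{1+\delta}\left(\frac{c_t}{c_{t+1}}\right)^{\gamma} - 1\right]r_t\right\} = 0$$

$$E\left\{\left[\frac{1+r_{t+1}}{1+\delta}\left(\frac{c_t}{c_{t+1}}\right)^{\gamma} - 1\right]\frac{c_{t-1}}{c_t}\right\} = 0$$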

therefore we have one over-identifying restriction whose validity can be tested
by using the J-statistic. Such statistic, distributed as a $\chi^2$ with one
degree of freedom, is easily computed by multiplying the J-statistic reported in
the E-Views output by the number of observations. Given that the observed value
is 1.5294 (237*0.006453), we do not reject the null of validity of instruments.
Note that the coefficient of risk aversion is estimated rather poorly, while the
discount factor is instead estimated rather precisely. To evaluate the relevance
of the correction for heteroscedasticity and correlation of unknown form we
implement GMM without such correction. This result is obtained by defining a
variable $u_t$ taking a value of zero everywhere and estimating the following
model:

$$u_{t+1} = \frac{1+r_{t+1}}{1+\delta}\left(\frac{c_t}{c_{t+1}}\right)^{\gamma} - 1 + \epsilon_{t+1} \tag{7.49}$$

The GMM estimates can be obtained by estimating (7.49) by instrumental
variables, using the constant, $r_t$, and $c_{t-1}/c_t$ as instruments. The
results of TSLS estimation are reported in Table 2.

Table 2: U=C(1)*(X1^C(2))*X2-1

Coefficient    Estimate    Std. Error    t-Statistic    Prob.
C(1)           0.998945    0.004947      201.9470       0.0000
C(2)           0.864734    2.044035      0.423052       0.6726

S.E. of regression    0.041545      Durbin-Watson stat    1.829335
Sum squared resid     0.405609

Instrument list: X1(-1) X2(-1) C

Results are essentially unaltered. An interesting exercise here is to assess
the stability of the estimated parameters over time.
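
The arithmetic of the test of over-identifying restrictions reported for Table 1
can be checked with a few lines of code. The sketch below rescales the
statistic reported by E-Views and computes the p-value from a $\chi^2$
distribution; `chi2_sf` is a small helper written for this illustration, using
the standard series expansion of the regularised incomplete gamma function
rather than any particular statistical library.

```python
import math

def chi2_sf(x, df):
    """P(chi2 with df degrees of freedom > x), via a series for P(df/2, x/2)."""
    if x <= 0:
        return 1.0
    a, half_x = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:
        term *= half_x / (a + n)
        total += term
        n += 1
    lower = total * math.exp(-half_x + a * math.log(half_x) - math.lgamma(a))
    return 1.0 - lower

T_OBS = 237                 # number of observations, from the text
J_REPORTED = 0.006453       # J-statistic in the E-Views output of Table 1
j_stat = T_OBS * J_REPORTED
p_value = chi2_sf(j_stat, df=1)   # 3 instruments, 2 parameters
```

The resulting p-value is well above conventional significance levels, in line
with the non-rejection of the validity of instruments stated above.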

7.6 GMM and monetary policy rules 

We have already introduced the estimation of monetary policy rules by GMM to
illustrate the possibility of correlation in the sample moments. We now
investigate this topic in greater depth, referring to the empirical work by
Clarida, Gali and Gertler [4]-[5]. Specification (7.29), although useful for
some illustrative purposes, is not successful in capturing the observed
persistence in interest rates. Therefore, in the literature the following
empirical model is usually specified:

$$r^*_t = \bar{r} + a_1E_t\left(\pi_{t+12}-\pi^*\right) + a_2E_t\left(y_t-y^*_t\right) \tag{7.50}$$

$$r_t = \left(1-\rho\right)r^*_t + \rho r_{t-1} + v_t \tag{7.51}$$

where $r^*_t$ is the target interest rate and $\bar{r}$ is the equilibrium value
for $r^*_t$. The partial adjustment mechanism introduced by equation (7.51) is
justified by the empirical observation of the tendency of Central Banks to
smooth interest rates.⁵ Moreover a constant target rate of inflation is assumed
in the estimated version of the rule. Combining equation (7.50) with (7.51) we
derive the following set of orthogonality conditions:

$$E_t\left[r_t - \left(1-\rho\right)a_0 - \left(1-\rho\right)a_1\pi_{t+12} - \left(1-\rho\right)a_2\left(y_t-y^*_t\right) - \rho r_{t-1} \mid u_t\right] = 0 \tag{7.52}$$

⁵ See Goodfriend [16].

where $a_0 = \bar{r} - a_1\pi^*$ and $u_t$ includes all the variables in the
Central Bank's information set at the time interest rates are chosen. GMM can be
used in this framework to estimate the parameters of interest $a_0$, $a_1$,
$a_2$ and $\rho$. The J-test for the validity of over-identifying restrictions
can then be used to assess whether the simple specification of the monetary
policy rule in (7.52) omits important variables which in fact enter the Central
Bank rule. Obvious candidates for the role of omitted variables are monetary
aggregates, foreign interest rates, long-term interest rates, exchange-rate
fluctuations and possibly stock-market overvaluation (do Central Banks care
about “irrational exuberance”?). Moreover the estimation of the parameters of
interest allows some relevant considerations on monetary policy. In fact, given
(7.50), it is possible to write an equilibrium relation for the real interest
rate as follows:

$$rr^*_t = \overline{rr} + \left(a_1-1\right)E_t\left(\pi_{t+12}-\pi^*\right) + a_2E_t\left(y_t-y^*_t\right), \tag{7.53}$$

where $\overline{rr}$ is the equilibrium real interest rate, independent of
monetary policy. Equation (7.53) illustrates the critical role of the parameter
$a_1$. If $a_1 > 1$ the target real rate is adjusted to stabilise inflation,
while with $0 < a_1 < 1$ it instead moves to accommodate inflation: the Central
Bank raises the nominal rate in response to an expected rise in inflation but
does not increase it sufficiently to keep the real rate from declining. Taylor
[33] and Clarida, Gali and Gertler [5] have shown that values $0 < a_1 < 1$ are
consistent with the possibility of persistent, self-fulfilling fluctuations in
inflation and output. Therefore the value of one for $a_1$ is an important
discriminatory criterion to judge Central Bank behaviour. Clarida, Gali and
Gertler show that in the pre-October 1979 period the Fed rule features
$a_1 < 1$, while the post-October 1979 period features $a_1 > 1$. Finally, it is
possible to use the fitted values for the parameters $a_0$ and $a_1$ to recover
an estimate of the Central Bank's constant target inflation rate $\pi^*$. In
fact, the empirical model does not allow separate identification of the
equilibrium inflation rate and of the equilibrium real interest rate, but it
does provide a relation between them conditional upon $a_0$ and $a_1$. Given
that $a_0 = \bar{r} - a_1\pi^*$ and $\overline{rr} = \bar{r} - \pi^*$, we have
then:

$$\pi^* = \frac{\overline{rr} - a_0}{a_1 - 1} \tag{7.54}$$

which establishes a relation between the target rate of inflation and the
equilibrium real interest rate defined by the parameters $a_0$ and $a_1$ in the
policy rule. Clarida, Gali and Gertler [4] set the real interest rate to the
average in the sample and use (7.54) to recover the implied value for $\pi^*$.
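
The algebra behind (7.53) and (7.54) can be written out step by step. The
derivation below assumes that the ex ante real target rate is defined as
$rr^*_t \equiv r^*_t - E_t\pi_{t+12}$, a definition which is implicit rather
than stated in the text.

```latex
\begin{align*}
rr^*_t &\equiv r^*_t - E_t\pi_{t+12} \\
       &= \bar{r} + a_1E_t(\pi_{t+12}-\pi^*) + a_2E_t(y_t-y^*_t)
          - E_t(\pi_{t+12}-\pi^*) - \pi^* \\
       &= (\bar{r}-\pi^*) + (a_1-1)E_t(\pi_{t+12}-\pi^*) + a_2E_t(y_t-y^*_t), \\[4pt]
a_0    &= \bar{r} - a_1\pi^*
        = (\overline{rr}+\pi^*) - a_1\pi^*
        = \overline{rr} - (a_1-1)\pi^*
        \;\Longrightarrow\;
        \pi^* = \frac{\overline{rr}-a_0}{a_1-1}.
\end{align*}
```

The last line also makes clear that $\pi^*$ is not identified when $a_1 = 1$.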

The database CGG contains monthly data for the US and German economies taken
from DATASTREAM and from the database on US monthly data used in Leeper, Sims
and Zha [22], which should enable replication of the reaction function estimated
by the authors, as well as testing for a number of interesting overidentifying
restrictions. The following variables for the sample 1979:1-1996:12 are
available:

GERCMR: German average (of the month) call money rate;

GER10Y: redemption yield on German 10-year government bonds; 

GERCP: German consumer price index; 

GERINFTAR: Bundesbank announced inflation target (rate of medium term 
unavoidable inflation); 

GERM1: German M1;

GERM3: German M3; 

GERIP: German Industrial Production; 

PCM: IMF world commodity price index (in US dollars); 

SMNBR: US smoothed (by a 36-month moving average) non-borrowed reserves; 

SMTR: US smoothed (by a 36-month moving average) total reserves; 

TOTMKUS_DY_01: US stock market dividend-yield; 

TOTMKUS_PE_01: US stock market price-earning ratio; 

TOTMKUS_PI_01: US stock market price index; 

US10Y: redemption yield on US 10-year government bonds; 

USCP: US consumer price index; 

USDM: US dollar/ D.Mark exchange rate; 

USFDTRG: US Federal Funds target; 

USFF: US average Federal Funds rate; 

USIP: US industrial production;

USLABCOSE: US unit labour cost; 

USM2SA: US M2; 

USMANHERA: US manufacturing hourly earnings; 

USOPERATE: US capacity utilisation rate. 

7.6.1 The estimation of a baseline policy rule for the FED 

We concentrate first on the US case, trying to replicate the results in Clarida,
Gali and Gertler [4]. A series of empirical problems must be solved in order to
perform GMM estimation of the monetary policy rule. The first issue we take up
is the measurement of the output gap. Clarida, Gali and Gertler take the
deviation of the log of industrial production from a quadratic trend. This is
easily obtained by taking the residuals of an OLS regression of the log of
industrial production on a constant, a linear trend and a quadratic trend. Such
a measurement of the cycle would be correct only if the log of industrial
production features a deterministic quadratic trend. To check the robustness of
the definition of the cycle to alternative de-trending methods we compare the
original CGG proposal (USGAP1) with the difference between industrial production
and its Hodrick-Prescott trend, with penalty parameter set to 14400 (USGAP2),
and with the demeaned capacity utilisation rate (USGAP3). We construct USGAP1,
USGAP2 and USGAP3 on the sample 1981:10 1997:12, as we would like to start
estimation of the policy rule from 1982:10 (the beginning of the interest rate
targeting regime). We report the alternative measures of the output gap in
Figure 7.1.

Fig. 7.1. Alternative measures of US output gap (series plotted: USGAP1, USGAP2,
USGAP3)
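
The construction of the first gap measure, the residuals from a regression on a
constant, a linear and a quadratic trend, can be sketched as follows. The
function name and the artificial series are assumptions made for illustration;
the actual USGAP1 is computed from the log of USIP in the CGG database.

```python
import numpy as np

def quadratic_trend_gap(log_ip):
    """Residuals from an OLS regression of log_ip on a constant, t and t^2."""
    T = len(log_ip)
    t = np.arange(T, dtype=float)
    D = np.column_stack([np.ones(T), t, t ** 2])
    coef, *_ = np.linalg.lstsq(D, log_ip, rcond=None)
    return log_ip - D @ coef

# Hypothetical series: a quadratic trend plus a 48-month cycle (195 months).
t = np.arange(195, dtype=float)
trend = 4.0 + 0.002 * t + 1e-6 * t ** 2
log_ip = trend + 0.02 * np.sin(2 * np.pi * t / 48)

gap = quadratic_trend_gap(log_ip)
```

By construction the gap has zero mean and is orthogonal to the trend terms, so a
series that is exactly a quadratic trend produces a gap of zero.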


We note that the three different measures do not show evident discrepancies as
far as the location of the turning points in the cycle is concerned up to 1990;
from 1990 onwards USGAP1 signals a persistent recession, not shared by the other
two measures. Obviously such a difference does show up in a corresponding
difference in policy rates. Orphanides [26] has considered extensively the
problem of measuring the output gap to show that different behaviours by the Fed
in the course of the seventies and the eighties can be explained by different
measures of the output gap rather than by different parameters in the reaction
function. To keep our results comparable with those of Clarida, Gali and
Gertler, we keep USGAP1 as the relevant measure of the gap; checking robustness
to different detrending choices could be an interesting exercise.

The second empirical problem is the choice of the instruments. Here we follow
CGG by taking as instruments the constant and the first six lags, the ninth and
the twelfth lag of each of the following variables: the output gap, the federal
funds rate, inflation and the log IMF commodity price index.

We then implement estimation by GMM, using the correction for heteroscedasticity
and autocorrelation of unknown form with a lag truncation parameter of 12 and
choosing Bartlett weights to ensure positive definiteness of the estimated
variance-covariance matrix. The results reported in Table 3 are obtained by
implementing GMM in E-Views.
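
The instrument set just described can be enumerated programmatically, which also
gives the number of over-identifying restrictions. The series names are those
appearing in the E-Views instrument list; the helper function itself is only an
illustrative sketch.

```python
LAGS = [1, 2, 3, 4, 5, 6, 9, 12]
SERIES = ["USGAP1", "USINFL", "USFF", "DLPCM"]

def instrument_list(series=SERIES, lags=LAGS):
    """Constant plus the selected lags of each series, in E-Views notation."""
    names = ["C"]
    for s in series:
        names += [f"{s}(-{k})" for k in lags]
    return names

instruments = instrument_list()
n_instruments = len(instruments)        # 1 constant + 4 series * 8 lags
n_params = 4                            # C(1), C(2), C(3), C(4)
df_overid = n_instruments - n_params    # degrees of freedom of the J-test
```

The count reproduces the 33 instruments and 29 over-identifying restrictions
used in the test of instrument validity for the baseline rule.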

Sample (adjusted): 1982:10 1996:12, 171 observations 
No prewhitening 
Bandwidth: Fixed (12) 

Kernel: Bartlett 

Convergence achieved after: 78 weight matricies, 79 total coef 
iterations 

USFF= C(2)*USFF(-1) +(1-C(2))*(C(1)+C(3)*USINFL(+12) +C(4) 
*USGAP1) 

Instrument list: C USGAP1(-1) USGAP1(-2) USGAP1(-3) USGAP1(-4)
USGAP1(-5) USGAP1(-6) USGAP1(-9) USGAP1(-12) USINFL(-1)
USINFL(-2) USINFL(-3) USINFL(-4) USINFL(-5) USINFL(-6)
USINFL(-9) USINFL(-12) USFF(-1) USFF(-2) USFF(-3) USFF(-4)
USFF(-5) USFF(-6) USFF(-9) USFF(-12) DLPCM(-1) DLPCM(-2)
DLPCM(-3) DLPCM(-4) DLPCM(-5) DLPCM(-6) DLPCM(-9)
DLPCM(-12)


Table 3:

Coefficient    Estimate    Std. Error    t-Statistic    Prob.
C(1)           2.87        0.99          2.90           0.0038
C(2)           0.92        0.012         73.82          0.0004
C(3)           1.73        0.25          6.87           0.0000
C(4)           0.66        0.10          6.60           0.0000

R-squared             0.98      Mean dependent var    6.713957
Adjusted R-squared    0.98      S.D. dependent var    2.191514
S.E. of regression    0.28      Sum squared resid     13.74
Durbin-Watson stat    1.06      J-statistic           0.0611


So we have an estimated $a_0 = 2.87$, an estimated $a_1 = 1.73$, an estimated
$a_2 = 0.66$, while $\rho = 0.92$. Estimates are in line with those obtained by
Clarida, Gali and Gertler, with $a_1 > 1$, although ours are slightly different.
Such a difference could be explained by their choice of a second order lag in
the adjustment, while we restrict ourselves to first order dynamics.⁶

⁶ Checking this empirically could be a useful exercise.

The statistic for the validity of instruments, distributed as a $\chi^2$ with 29
degrees of freedom (33 instruments for 4 parameters), takes the value of 10.45
(0.0611*171, as the statistic reported in E-Views is divided by the number of
observations) and does not reject the null of validity of instruments. If we
follow the practice suggested by Clarida, Gali and Gertler to derive an estimate
for the inflation target by using the estimated parameters and the average real
interest rate over the sample as a proxy for the real equilibrium interest rate,
we get a point estimate of 0.5 with a rather wide confidence interval (as the 95
per cent confidence interval for $a_0$ spans 0.89-4.85). Overall the rule is
rather successful in explaining the Fed behaviour, as illustrated in Figure 7.2,
where we report observed policy rates and the 95 per cent confidence interval
from our estimated equation.



Fig. 7.2. Observed US policy rates and the 95 per cent confidence interval from
the estimated policy rule


7.6.2 Does the Fed care about the long-term interest rate?

Within the GMM framework it is rather easy to check the importance of variables
omitted from the policy rule. If important variables are omitted from the rule,
the orthogonality condition should be violated for them, and the test for the
validity of instruments should then reject the null hypothesis.
There is a rather wide literature concentrating on the importance of long-term
interest rates for the Fed, explicitly related to their signalling role for “inflation
scares”. As pointed out by Goodfriend [16], the behaviour of long-term interest
rates could be informative on agents' expectations of inflation and on the effects
of monetary policy on such expectations. Campbell (1995) concentrates on the
collapse of bond prices in 1994, relating it to movements in the term premium
generated by a rise in expected inflation that was not matched by any movement
in the same direction in actual inflation. Looking at the 1994 data we see clearly
that the Fed reacted late to the increase in long-term interest rates, and that it
took several tightening steps in the target federal funds rate to convince markets
of the Central Bank's determination to fight inflation. In fact, only after several
tightening movements in the policy rate did the long-term interest rate start to
reverse its upward trend. This discussion shows that there are good theoretical
and policy reasons for the Central Bank to monitor long-term interest rates, and
the omission of long-term interest rates from the rule seems an obvious candidate
for putting our testing procedure to work. We therefore re-estimate the baseline
model, including the contemporaneous level and the first lag of the long-term
interest rate in the set of instruments. The following results are obtained:

USFF = C(2)*USFF(-1) + (1-C(2))*(C(1) + C(3)*USINFL(+12) + C(4)*USGAP1)

Instrument list: C USGAP1(-1) USGAP1(-2) USGAP1(-3) USGAP1(-4)
USGAP1(-5) USGAP1(-6) USGAP1(-9) USGAP1(-12) USINFL(-1)
USINFL(-2) USINFL(-3) USINFL(-4) USINFL(-5) USINFL(-6)
USINFL(-9) USINFL(-12) USFF(-1) USFF(-2) USFF(-3) USFF(-4)
USFF(-5) USFF(-6) USFF(-9) USFF(-12) DLPCM(-1) DLPCM(-2)
DLPCM(-3) DLPCM(-4) DLPCM(-5) DLPCM(-6) DLPCM(-9)
DLPCM(-12) US10Y US10Y(-1)


Table 4:

Coefficient   Estimate   Std. Error   t-Statistic   Prob.
C(1)          4.23       1.10         3.84          0.0002
C(2)          0.95       0.007        120.63        0.0000
C(3)          1.48       0.27         5.37          0.0000
C(4)          0.86       0.11         7.47          0.0000


R-squared 0.98 
Adjusted R-squared 0.98 
S.E. of regression 0.27 
Durbin-Watson stat 1.17 


Mean dependent var 6.713957 
S.D. dependent var 2.191514 
Sum squared resid 12.69 
J-statistic 0.067 


The point estimates of the parameters are slightly modified, but the test for the
validity of instruments, now with 31 degrees of freedom (35 instruments for 4
parameters), still does not reject the null (0.067 × 171 ≈ 11.45). In the light
of this evidence we can conclude that the long-term interest rate affects the Fed's
behaviour as a leading indicator for future inflation but not as an independent
argument of the monetary policy rule.

7.7 Interest rate rules and Central Banks’ preferences 

Monetary policy rules like those we have so far considered are empirically suc¬ 
cessful and useful to show how the GMM methodology is applied. However, they 
are not in line with our introduction to the GMM methodology in that they are 
not derived explicitly from an intertemporal optimization problem and therefore 
no deep parameters describing Central Banks’ preferences are identifiable. In fact 
it is perhaps surprising that the GMM methodology has been used to estimate 
reaction functions, while the optimization problem of the Central Banks provides 
first order conditions which are instead a more natural object of GMM estima¬ 
tion. Following Svensson [31], we consider the simplest possible version of the 
inflation targeting problem. The Central Bank faces the following intertemporal 
optimisation problem: 


$$\min \; E_t \sum_{i=0}^{\infty} \delta^i L_{t+i} \tag{7.55}$$

where:

$$L_t = \frac{1}{2}\left[(\pi_t - \pi^*)^2 + \lambda x_t^2\right] \tag{7.56}$$


where $E_t$ denotes expectations conditional upon the information set available
at time $t$, $\delta$ is the relevant discount factor, $\pi_t$ is inflation at time $t$, $\pi^*$ is the
target level of inflation, $x_t$ represents deviations of output from its natural level,
and $\lambda$ is a parameter which determines the degree of flexibility in inflation targeting.
When $\lambda = 0$ the Central Bank is defined as a strict inflation targeter. As the
monetary instrument is the policy rate, $i_t$, the structure of the economy must be
described to obtain an explicit form for the policy rule. We consider the following
specification for aggregate demand and supply in a closed economy:

$$x_{t+1} = \beta_x x_t - \beta_r (i_t - E_t\pi_{t+1} - r) + u^d_{t+1} \tag{7.57}$$

$$\pi_{t+1} = \pi_t + \alpha_x x_t + u^s_{t+1} \tag{7.58}$$

As shown in Svensson [30], the first order conditions for optimality may be
written as follows:

$$\frac{dL}{di_t} = (E_t\pi_{t+2} - \pi^*) + \frac{\lambda}{\delta\alpha_x k}\, E_t x_{t+1} = 0 \tag{7.59}$$

$$k = 1 + \frac{\delta\lambda k}{\lambda + \delta\alpha_x^2 k} \tag{7.60}$$


Note that (7.59) delivers the set of orthogonality conditions which constitute
the natural object of GMM estimation. Joint estimation of (7.57), (7.58)
and (7.59) allows identification and estimation of the parameters describing the
structure of the economy and of the parameters describing Central Banks' preferences.
Alternatively, (7.57), (7.58) and (7.59) can be used to derive an interest
rate rule. In fact, by leading (7.58) one period, taking expectations at time $t$ and
substituting from (7.57), we obtain:

$$E_t\pi_{t+2} = E_t\pi_{t+1} + \alpha_x\left[\beta_x x_t - \beta_r (i_t - E_t\pi_{t+1} - r)\right] \tag{7.61}$$

and by substituting (7.61) in (7.59) we derive a standard interest rate rule:

$$i_t = r + \pi^* + \left(1 + \frac{1}{\alpha_x\beta_r}\right)(E_t\pi_{t+1} - \pi^*) + \frac{\beta_x}{\beta_r}\, x_t + \frac{\lambda}{\delta\alpha_x k}\,\frac{1}{\alpha_x\beta_r}\, E_t x_{t+1} \tag{7.62}$$

A number of comments on this rule are in order:


• If the rule is estimated as a single equation, then the fitted parameters
are convolutions of the parameters describing Central Banks' preferences
($\pi^*, \lambda, \delta$) and of those describing the structure of the economy
($\alpha_x, \beta_r, \beta_x, r$). Thus the estimated parameters in the interest rate rule
are not “deep” in the sense of Lucas (1976).

• As the structure of the economy cannot be identified from the estimation of
the rule only, it is impossible to assess whether the responses of Central Banks
to output and inflation are consistent with the parameters describing the impact
of the policy instrument on these variables. Note, for example, that the
estimation of an interest rate rule relating the policy rate to the output gap
and to the deviation of expected inflation from target does not help to distinguish
a strict inflation targeter ($\lambda = 0$, in the terminology of Svensson)
from a flexible inflation targeter ($\lambda > 0$).

• Econometric identification of the rule requires the timing assumption that
the Central Bank can set policy rates in response to contemporaneous
macro variables in the economy, but policy rates do not have a contemporaneous
impact on those variables. This assumption is commonly used to
identify VAR models of the monetary transmission mechanism.

• In order to make (7.62) consistent with the data, the rule has been interpreted
as delivering “target” interest rates, and a sluggish adjustment of
actual to target rates has been imposed (Clarida, Gali and Gertler, 1997).
Direct estimation of the policy rule does not allow us to identify a structure of
Central Bank's preferences which is consistent with interest rate smoothing.

• There is only one empirical implication of the rule which can be confronted
with the data independently of the identification of the parameters of interest,
namely whether the parameter describing the reaction of policy rates
to a gap between expected and target inflation is larger than one. In fact
a monetary policy which accommodates changes in inflation, that is, one with
a coefficient on $E_t\pi_{t+1}$ smaller than one, will not in general converge to the
target rate $\pi^*$. This empirical prediction is the one which has attracted most
of the discussion on estimated monetary policy rules (see again Clarida, Gali
and Gertler, 1997).
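The first two comments can be made concrete with a small numerical sketch. Solving (7.57)-(7.60) for the policy rate gives reduced-form coefficients of $1 + 1/(\alpha_x\beta_r)$ on the expected inflation gap, $\beta_x/\beta_r$ on the output gap and $\lambda/(\beta_r\delta\alpha_x^2 k)$ on the expected output gap, where $k$ is the positive root of (7.60). The code below evaluates these expressions; the function names and all parameter values are hypothetical and chosen only for illustration, not taken from any estimate in the text.

```python
import math


def svensson_k(lam, delta, alpha_x):
    """Positive root of k = 1 + delta*lam*k / (lam + delta*alpha_x**2 * k), i.e. (7.60)."""
    a = delta * alpha_x ** 2
    b = lam - a - delta * lam
    # (7.60) rearranges to the quadratic a*k**2 + b*k - lam = 0.
    return (-b + math.sqrt(b * b + 4.0 * a * lam)) / (2.0 * a)


def rule_coefficients(lam, delta, alpha_x, beta_x, beta_r):
    """Reduced-form coefficients of the interest rate rule implied by (7.57)-(7.60)."""
    k = svensson_k(lam, delta, alpha_x)
    return {
        "k": k,
        "inflation_gap": 1.0 + 1.0 / (alpha_x * beta_r),
        "output_gap": beta_x / beta_r,
        "expected_output_gap": lam / (beta_r * delta * alpha_x ** 2 * k),
    }


# Hypothetical structural parameters, for illustration only.
structure = dict(delta=0.98, alpha_x=0.3, beta_x=0.8, beta_r=0.5)

strict = rule_coefficients(lam=0.0, **structure)     # strict inflation targeting
flexible = rule_coefficients(lam=0.5, **structure)   # flexible inflation targeting

for name, coeffs in (("strict", strict), ("flexible", flexible)):
    print(name, {key: round(value, 3) for key, value in coeffs.items()})
```

In this sketch the coefficients on the expected inflation gap and on the current output gap depend only on the structure of the economy, so their estimates alone carry no information on $\lambda$; the preference parameter enters only through the term in $E_t x_{t+1}$.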

To provide a better mapping from Central Banks' behaviour to their preferences,
a strategy closer to the spirit of intertemporal optimisation seems more
appropriate. First, estimate the structure of the economy to identify the parameters
of the aggregate supply and demand functions. Second, estimate the Euler
equation for the solution of the intertemporal optimisation problem to identify
Central Banks' preferences. In this step (and with reference to the simple example
analyzed above), given knowledge of $\alpha_x$ and $\beta_r$, we can identify directly,
from the estimation of the first order conditions (7.59), the $\lambda$ and $\pi^*$ associated
with each assumed value of the discount factor, $\delta$. Third, test whether the monetary
policy rule consistent with the structure of the economy and the preferences of
the Central Bank matches the actual behaviour of policy rates. This strategy has
been followed by Favero and Rovelli [12], whose empirical investigation leads them to
select strict inflation targeting with real interest rate smoothing (with estimated
relative weights of about four to one) as the best model to describe the
Fed's behaviour in the eighties.
INTERTEMPORAL OPTIMIZATION AND CALIBRATION

8.1 Introduction 1 

The calibration approach takes intertemporally optimized models to the data
to answer specific economic questions. The essence of this methodology can be
described in six steps, see Canova and Ortega (1997) and Cooley (1997):

• formulation of an economic question 

• selection of a model design which bears some relevance to the question 
asked 

• choice of functional forms for the primitives of the model to find a solution 
for the endogenous variables in terms of the exogenous variables and the 
parameters 

• choice of parameters and stochastic processes for the exogenous variables 
and simulation of paths for the endogenous variables of the model 

• selection of a metric to compare the outcomes of the model relative to a
set of “stylized facts”

• policy analyses, if required 

We illustrate the methodology by considering the question of the relative
relevance of supply-side technological shocks and monetary policy in determining
fluctuations in macroeconomic variables. We do so by introducing money in
the utility function, thereby modifying the traditional artificial economies
considered in the calibration camp, in which money is typically a “veil”. Obviously,
there are no logical objections against the use of intertemporally optimized
models in which monetary policy is not ineffective as a consequence of
some market imperfections or some “cash in advance” constraints. In fact, the
diffusion of calibrated models featuring short-run effectiveness of monetary policy
has recently increased in the literature. We consider a model, proposed by
McCallum-Nelson (1997), which is directly comparable with models used in previous
sections of the book since it can be re-parameterized as a forward-looking
IS-LM framework. Equipped with this model design, we illustrate the main steps
of the calibration approach. In describing the fifth step we show how the impulse
responses derived from the artificial economy differ from the impulse responses
described in Chapter 6, which represent the benchmark stylized facts on the
effect of monetary policy on macroeconomic variables. Lastly, we discuss policy
analyses.

1 This chapter has been jointly written with Marco Maffezzoli.
8.2 Model design

Our model design is taken from McCallum and Nelson (1999a). The economy
is inhabited by a large number of infinitely-lived, identical, price-taking
households. They can be aggregated into a single representative
household, whose preferences are summarized by the following intertemporal
utility function:


$$U_t = E_t \sum_{i=0}^{\infty} \tilde\beta^i\, \frac{1}{1-\mu}\left[\tilde C_{t+i}^{1-\mu} + \theta\left(\frac{\tilde M_{t+i}}{P_{t+i}}\right)^{1-\mu}\right] \tag{8.1}$$

where $\mu \in (0, \infty)$ is the reciprocal of the intertemporal elasticity of substitution,
$\tilde\beta \in (0,1)$ is the intertemporal discount factor, $\tilde C_t \in \mathbb{R}_+$ is the
real consumption level at date $t$,² $\tilde M_t / P_t \in \mathbb{R}_+$ is the stock of real money
balances held at the start of period $t$, and $\theta \in (0, \infty)$ is the relative weight
of real money balances in the utility function. As stated in McCallum and
Nelson (1999a, p. 15),

“... the rationale for the inclusion of $\tilde M_t / P_t$ is of course that holdings
of the economy's medium of exchange provide transaction services that
reduce ... (the) resources needed in 'shopping' for the numerous distinct
consumption goods whose aggregate is represented by $C_t$...”.

Each household produces a single good using the following constant-returns-
to-scale Cobb-Douglas production function:

$$\tilde Y_t = a_t \tilde K_t^{1-\alpha} (Z_t \tilde n_t)^{\alpha} \tag{8.2}$$

with $\alpha \in (0,1)$, where $\tilde K_t \in \mathbb{R}_+$ is the stock of capital held by the household
at date $t$, $Z_t = \gamma^t \in \mathbb{R}_{++}$ is labor-augmenting exogenous technical progress,
$\tilde n_t \in [0,1]$ is the labor input, and $a_t \in \mathbb{R}$ is a stochastic measure of Total Factor
Productivity (TFP). We assume that the natural logarithm of $a_t$ follows a first-
order univariate AR process:

$$\ln(a_{t+1}) = (1 - \rho)\ln(a) + \rho\ln(a_t) + \varepsilon^a_{t+1} \tag{8.3}$$

where $a$ is the unconditional mean, $\rho \in (0,1)$ the persistence parameter, and
$\varepsilon^a_t \sim N(0, \sigma_a)$ the i.i.d. innovation. By adequately choosing units, we impose
$a = 1$.
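A short simulation may help to visualize the TFP process in (8.3). The sketch below is illustrative only: the function name `simulate_log_ar1` and the numerical values of the persistence and of the innovation dispersion are hypothetical, and `sigma` is interpreted as the standard deviation of the innovation, which is an assumption about the notation.

```python
import math
import random


def simulate_log_ar1(rho, sigma, periods, mean=1.0, start=None, seed=0):
    """Simulate a level series whose logarithm follows the AR(1) process in (8.3)."""
    rng = random.Random(seed)
    log_mean = math.log(mean)
    log_x = log_mean if start is None else math.log(start)
    path = []
    for _ in range(periods):
        path.append(math.exp(log_x))
        # ln(a_{t+1}) = (1 - rho) ln(a) + rho ln(a_t) + innovation
        log_x = (1.0 - rho) * log_mean + rho * log_x + rng.gauss(0.0, sigma)
    return path


# Hypothetical values: a persistent process with small innovations.
tfp = simulate_log_ar1(rho=0.95, sigma=0.007, periods=200)

# Without innovations, a displacement from the mean decays at rate rho.
decay = simulate_log_ar1(rho=0.95, sigma=0.0, periods=5, start=1.10)
print([round(x, 4) for x in decay])
```

Working in logarithms guarantees that the simulated TFP level stays strictly positive, whatever the realization of the innovations.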

Each household inelastically supplies one unit of labor to a competitive labor
market, from which the same household, as a producer, purchases the labor input
at the real wage rate $W_t$. Furthermore, a market for one-period government
bonds exists. These bonds pay between date $t$ and $t + 1$ a real interest rate equal
to $r_t$.

2 Tilded variables refer to individual quantities. Non-tilded variables, instead, refer to
aggregate per-household quantities.
The representative household's budget constraint has the following specification:

$$\tilde K_{t+1} + (1 + \pi_{t+1})\frac{\tilde M_{t+1}}{P_{t+1}} + \tilde B_{t+1} = (1-\delta)\tilde K_t + \frac{\tilde M_t}{P_t} + (1 + r_t)\tilde B_t + a_t \tilde K_t^{1-\alpha}(Z_t \tilde n_t)^{\alpha} - \tilde C_t - (\tilde n_t - 1)W_t - V_t, \tag{8.4}$$

where $\tilde B_t \in \mathbb{R}_+$ is the number of real government bonds held at the beginning
of period $t$, $P_t$ is the money price of goods, $\pi_{t+1} = (P_{t+1} - P_t)/P_t$ is the inflation
rate, $V_t$ is a lump-sum tax levied on the household, and $\delta$ is the depreciation rate.

The presence of exogenous technical progress introduces a non-stationary
component in the system. This implies that the model does not converge to a
steady state in the long run. The original non-stationary model can be transformed
into a stationary one simply by normalizing all equations with respect to
$Z_t$.³ Normalization of equation (8.1) delivers:

$$U_t = E_t \sum_{i=0}^{\infty} \beta^i\, \frac{1}{1-\mu}\left[\tilde c_{t+i}^{\,1-\mu} + \theta\, \tilde m_{t+i}^{\,1-\mu}\right] \tag{8.5}$$

where $\beta = \tilde\beta\gamma^{1-\mu}$, $\tilde c_t = \tilde C_t/Z_t$, and $\tilde m_t = \tilde M_t/(P_t Z_t)$. To ensure finiteness of our
objective function, we impose that $\beta < 1$. Similarly, normalizing (8.4) we obtain:

$$\gamma\tilde k_{t+1} + (1 + \pi_{t+1})\gamma\tilde m_{t+1} + \gamma\tilde b_{t+1} = (1-\delta)\tilde k_t + \tilde m_t + (1 + r_t)\tilde b_t + a_t \tilde k_t^{1-\alpha}\tilde n_t^{\alpha} - \tilde c_t - (\tilde n_t - 1)w_t - v_t, \tag{8.6}$$

where small letters identify normalized variables.

8.2.1 Households 

The representative household solves a stochastic optimal control problem, with 
consumption and labor as control variables, and capital, money, and bonds as 
endogenous state variables. Formally, (8.5), evaluated at date 0, is maximized 
subject to (8.6) and the initial conditions for all endogenous state variables. 

In order to obtain the first order conditions, we form a Lagrangian in expec¬ 
tations: 


$$L = E_t \sum_{i=0}^{\infty} \beta^i \left\{ \frac{\tilde c_{t+i}^{\,1-\mu} + \theta\,\tilde m_{t+i}^{\,1-\mu}}{1-\mu} + \lambda_{t+i}\Big[(1-\delta)\tilde k_{t+i} + \tilde m_{t+i} + (1 + r_{t+i})\tilde b_{t+i} + a_{t+i}\tilde k_{t+i}^{1-\alpha}\tilde n_{t+i}^{\alpha} - \tilde c_{t+i} - (\tilde n_{t+i} - 1)w_{t+i} - v_{t+i} - \gamma\tilde k_{t+i+1} - (1 + \pi_{t+i+1})\gamma\tilde m_{t+i+1} - \gamma\tilde b_{t+i+1}\Big] \right\}$$

3 Since $\gamma$ is exogenous, the normalization can be easily reversed: the original and the
transformed models are isomorphic. Any qualitative conclusion we may reach by studying the
normalized model can be immediately extended to the original one.
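where $\lambda_{t+i}$ denotes the present-value costate variables, and differentiate it with
respect to $\tilde c_{t+i}$, $\tilde n_{t+i}$, $\tilde k_{t+i}$, $\tilde m_{t+i}$, $\tilde b_{t+i}$ and $\lambda_{t+i}$. The first order conditions for
optimality are:

$$E_t\left(\tilde c_{t+i}^{\,-\mu}\right) = E_t\left(\lambda_{t+i}\right) \tag{8.7}$$

$$E_t\left(\alpha a_{t+i}\tilde k_{t+i}^{1-\alpha}\tilde n_{t+i}^{\alpha-1}\right) = E_t\left(w_{t+i}\right) \tag{8.8}$$

$$\beta E_t\left[\lambda_{t+i}(1-\alpha)a_{t+i}\tilde k_{t+i}^{-\alpha}\tilde n_{t+i}^{\alpha} + \lambda_{t+i}(1-\delta)\right] = \gamma E_t\lambda_{t+i-1} \tag{8.9}$$

$$\beta E_t\left[\theta\tilde m_{t+i}^{\,-\mu} + \lambda_{t+i}\right] = E_t\left[(1 + \pi_{t+i})\gamma\lambda_{t+i-1}\right] \tag{8.10}$$

$$\beta E_t\left[\lambda_{t+i}(1 + r_{t+i})\right] = \gamma E_t\lambda_{t+i-1} \tag{8.11}$$

$$\gamma\tilde k_{t+i+1} + (1 + \pi_{t+i+1})\gamma\tilde m_{t+i+1} + \gamma\tilde b_{t+i+1} = (1-\delta)\tilde k_{t+i} + \tilde m_{t+i} + (1 + r_{t+i})\tilde b_{t+i} + a_{t+i}\tilde k_{t+i}^{1-\alpha}\tilde n_{t+i}^{\alpha} - \tilde c_{t+i} - (\tilde n_{t+i} - 1)w_{t+i} - v_{t+i}. \tag{8.12}$$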

Conditions (8.7)-(8.12), together with the following transversality condition:

$$\lim_{i\to\infty} E_0\, \beta^i \lambda_{t+i}\left[\tilde k_{t+i+1} + \tilde m_{t+i+1} + \tilde b_{t+i+1}\right] = 0, \tag{8.13}$$

are necessary and sufficient for a solution of the household's problem, i.e. they
completely characterize the sequence of probability measures that solve the
household's stochastic optimal control problem. Given that the state variables are
always positive, we may rewrite (8.13) as three separate transversality conditions:

$$\lim_{i\to\infty} E_0\, \beta^i \lambda_{t+i}\tilde k_{t+i+1} = 0, \quad \lim_{i\to\infty} E_0\, \beta^i \lambda_{t+i}\tilde m_{t+i+1} = 0, \quad \lim_{i\to\infty} E_0\, \beta^i \lambda_{t+i}\tilde b_{t+i+1} = 0. \tag{8.14}$$
8.2.2 Government

The government's budget constraint, written in per-household terms, is:

$$-v_t = (1 + \pi_{t+1})\gamma m_{t+1} - m_t + \gamma b_{t+1} - (1 + r_t) b_t, \tag{8.15}$$

where $m_t$ is the per-household real money supply and $b_t$ is the per-household
supply of government bonds.

The dynamics of government bonds also have to satisfy the so-called No-Ponzi-
Game (NPG) condition:

$$\lim_{s\to\infty} E_t\left[\prod_{j=t}^{t+s}(1 + r_j)^{-1}\, b_{t+s+1}\right] \geq 0. \tag{8.16}$$

The NPG condition states that the present value of government bonds cannot
be strictly negative in the long run. In other words, it rules out the possibility
for the government to repay existing debt by always issuing new debt.
The (intratemporal) budget constraint (8.15) together with the NPG condition
(8.16) forms an intertemporal budget constraint.

For the sake of simplicity, we impose that normalized lump-sum taxes are constant
over time, i.e. that $v_t = v$ for each $t$. Furthermore, we assume that the
nominal money stock grows at an exogenously given rate $\eta_t$:

$$M_t = \prod_{i=0}^{t}\eta_i\, M_0. \tag{8.17}$$

Finally, we assume that the logarithm of $\eta_t$ follows a stationary AR process:

$$\ln(\eta_{t+1}) = (1 - \zeta)\ln(\eta) + \zeta\ln(\eta_t) + \varepsilon^{\eta}_{t+1} \tag{8.18}$$

where $\eta$ is the unconditional mean, $\zeta \in (0, 1)$ the persistence parameter, and
$\varepsilon^{\eta}_t \sim N(0, \sigma_\eta)$ the i.i.d. innovation.

Equation (8.17) can be interpreted as a “degenerate” version of the central
banker's reaction function, since monetary policy, i.e. the growth rate of nominal
money balances, does not depend on any endogenous variable.⁴ In this framework,
then, monetary policy shocks can be modeled as unexpected shocks to the
exogenous growth rate of nominal money balances. In other words, they coincide
with the i.i.d. innovations in (8.18).
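To illustrate what a monetary policy shock means in this setting, the sketch below traces the effect of a single innovation in (8.18) on money growth and, through (8.17), on the level of the nominal money stock, both in log deviations from the no-shock path. The function names and the numerical values of the persistence and of the shock are hypothetical.

```python
def money_growth_response(zeta, shock, horizon):
    """Log deviation of gross money growth from its mean, h = 0..horizon, after one innovation."""
    return [shock * zeta ** h for h in range(horizon + 1)]


def money_stock_response(zeta, shock, horizon):
    """Cumulated log deviation of the nominal money stock implied by (8.17)."""
    total = 0.0
    path = []
    for deviation in money_growth_response(zeta, shock, horizon):
        total += deviation   # the stock is the product of past growth factors
        path.append(total)
    return path


# Hypothetical values: a one per cent innovation with persistence 0.5.
growth = money_growth_response(zeta=0.5, shock=0.01, horizon=60)
stock = money_stock_response(zeta=0.5, shock=0.01, horizon=60)

print([round(x, 5) for x in growth[:4]])
print(round(stock[-1], 5))
```

The growth rate returns geometrically to its mean, while the money stock settles on a permanently higher path, displaced by approximately the shock divided by one minus the persistence parameter.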

4 Christiano, Eichenbaum and Evans (1998) bridge the gap between the VAR-based empirics
on the monetary transmission mechanism and simple rules like the one considered in theoretical
models by showing that any equilibrium allocation generated under an endogenous policy
rule can be replicated by an adequately parameterised recursive exogenous policy rule. In
other words, simplified rules like the one considered here share impulse response functions and
stochastic properties with endogenous policy rules estimated in VAR models of the monetary
transmission mechanism.
8.3 Dynamic equilibrium

To characterize a dynamic equilibrium we note that all households are identical;
therefore in equilibrium individual and aggregate per-household quantities have
to coincide, which implies that all tildes can be dropped. Furthermore, we impose
the following conditions:

$$\tilde n_t = n_t = 1, \qquad \tilde m_t = m_t = \frac{M_t}{\gamma^t P_t}, \qquad \tilde b_t = b_t, \tag{8.19}$$

where in (8.19) $M_t$ is the per-household nominal money supply. Equation (8.19)
equates demand and supply for the labor input, the stock of nominal money
balances, and the stock of government bonds.

Combining (8.17) and (8.19) with the definition of the inflation rate we obtain:

$$1 + \pi_{t+1} = \frac{\eta_t}{\gamma}\,\frac{m_t}{m_{t+1}} \tag{8.20}$$


Two comments are in order here: 


1. The variable $m_t$, a state variable from the household's point of view, becomes
a forward-looking decision variable when considered from
the aggregate point of view. The reason is the following. The nominal
money supply is exogenous, while the demand for real money balances is
endogenous. The price level $P_t$ has to equate supply and demand, i.e. under
rational expectations it has to satisfy the first order condition governing
the accumulation of real money balances, equation (8.10). The price level,
then, replaces $M_t$ as an aggregate endogenous variable, with the difference
that $P_t$ is not a state variable, but a forward-looking variable that can
be treated as a costate variable. The variable $m_t$, then, is the ratio between
$M_t\gamma^{-t}$, an exogenous process, and $P_t$. To stress this point, we may rewrite
$m_t$ as $1/p_t$, where $p_t = \gamma^t P_t/M_t$ is a normalized stationary variable.

2. Since both the nominal money supply and the lump-sum transfers are
exogenous, bonds have to counterbalance seigniorage in order to keep the
government budget balanced. The supply of government bonds is then, in
some sense, exogenous too. More precisely, it is beyond the government's
control. The demand for government bonds is, however, still endogenous.
The real interest rate $r_t$ has to equate supply and demand for bonds, i.e.
under rational expectations it has to satisfy condition (8.11). It also becomes
an aggregate forward-looking decision variable.

By substituting the government budget constraint and (8.20) in the household's
first order conditions, we obtain a system of stochastic difference equations
that fully describes a dynamic competitive equilibrium in our economy. We write
the system by setting $i = 1$ in our first order conditions and by dropping, for the
sake of simplicity, the optimality condition for labour supply:

$$c_t^{-\mu} = \lambda_t \tag{8.21}$$

$$E_t\left[\lambda_{t+1}\left((1-\alpha)a_{t+1}k_{t+1}^{-\alpha} + 1 - \delta\right)\right] = \frac{\gamma}{\beta}\lambda_t \tag{8.22}$$

$$\beta E_t\left(\theta p_{t+1}^{\mu} + \lambda_{t+1}\right) = E_t\left[\eta_t\,\frac{p_{t+1}}{p_t}\,\lambda_t\right] \tag{8.23}$$

$$E_t\left[\lambda_{t+1}(1 + r_{t+1})\right] = \frac{\gamma}{\beta}\lambda_t \tag{8.24}$$

$$\gamma k_{t+1} = (1-\delta)k_t + a_t k_t^{1-\alpha} - c_t \tag{8.25}$$

$$\gamma b_{t+1} = \frac{1 - \eta_t}{p_t} + (1 + r_t)b_t - v_t. \tag{8.26}$$

Furthermore, the transversality conditions (8.14) can be restated as:

$$\lim_{t\to\infty} E_0\, \beta^t \lambda_t k_{t+1} = 0, \qquad \lim_{t\to\infty} E_0\, \frac{\beta^t \lambda_t}{p_{t+1}} = 0, \qquad \lim_{t\to\infty} E_0\, \beta^t \lambda_t b_{t+1} = 0. \tag{8.27}$$


Note that the last transversality condition in (8.27) implies the NPG condition
(8.16). Note furthermore that, as in Benassy (1995), the dynamics of consumption
and investment are driven by real shocks only (consider equations (8.21),
(8.22) and (8.25): they represent a stand-alone Brock-Mirman model). The “real”
world is in some sense completely separated from the “monetary” world. The
dynamics of government debt, on the other hand, are driven by both monetary and
real shocks. This “separation” result is of course not robust, as it depends on our
restrictive assumptions on the structure of preferences and the money creation
process.
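The separation between the real and the monetary block can be illustrated by computing the non-stochastic steady state of the real block alone. With $a = 1$ and all variables constant, (8.22) reduces to $(1-\alpha)k^{-\alpha} + 1 - \delta = \gamma/\beta$, (8.25) gives $c = k^{1-\alpha} + (1 - \delta - \gamma)k$, and (8.21) gives $\lambda = c^{-\mu}$. The sketch below solves these three equations; the function name and the parameter values are hypothetical and are not the calibration adopted in the text.

```python
def real_steady_state(alpha, beta, gamma, delta, mu):
    """Steady state of capital, consumption and the costate from (8.21), (8.22), (8.25).

    Assumes a = 1 and gamma / beta - 1 + delta > 0.
    """
    gross_return = gamma / beta
    # (8.22) in steady state: (1 - alpha) * k**(-alpha) + 1 - delta = gamma / beta
    k = ((1.0 - alpha) / (gross_return - 1.0 + delta)) ** (1.0 / alpha)
    # (8.25) in steady state: gamma * k = (1 - delta) * k + k**(1 - alpha) - c
    c = k ** (1.0 - alpha) + (1.0 - delta - gamma) * k
    # (8.21): marginal utility of consumption equals the costate
    lam = c ** (-mu)
    return k, c, lam


# Hypothetical parameter values, for illustration only.
params = dict(alpha=0.64, beta=0.99, gamma=1.004, delta=0.025, mu=2.0)
k_ss, c_ss, lam_ss = real_steady_state(**params)
print(round(k_ss, 3), round(c_ss, 3), round(lam_ss, 4))
```

Neither $\theta$ nor the parameters of the money growth process appear among the arguments of the function, which is the steady-state counterpart of the separation result discussed above.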

In summary, equations (8.21)-(8.26), together with the initial conditions and
(8.27), form a system of stochastic difference equations that completely describes
the competitive equilibrium allocations for our economy. The solution to such a
system is an infinite sequence of conditional probability measures that converge
in the long run to an invariant, or unconditional, distribution; in other words, it
is a sequence $\{P_t(c_t, p_t, r_t, \lambda_t, k_t, b_t, a_t, \eta_t \mid k_0, b_0, a_0, \eta_0)\}_{t=0}^{\infty}$, where each $P_t(\cdot)$
represents a probability measure on $\mathbb{R}^n$, converging to $P(c, p, r, \lambda, k, b, 1, \eta)$ as $t \to \infty$.
Given the recursive structure of our system, a solution can also be seen as a set
of aggregate decision rules for $c_t$, $p_t$, $r_t$, $\lambda_t$, $k_{t+1}$ and $b_{t+1}$, expressed as functions
of $k_t$, $b_t$, $a_t$ and $\eta_t$.
8.4 An IS-LM interpretation

McCallum and Nelson (1999a) show that a version of the intertemporally op¬ 
timized model we have considered so far can be re-parameterized in terms of 
a traditional IS-LM framework. In fact, the addition of an expectational term 
is sufficient to make the standard IS function match a fully optimizing model, 
whereas no changes are needed for the LM function. 

To show this point, combine (8.23) with (8.21) and (8.24), obtaining:

$$E_t\left(\theta m_{t+1}^{-\mu} + c_{t+1}^{-\mu}\right) = \left[1 + E_t(\pi_{t+1})\right] E_t\left[c_{t+1}^{-\mu}(1 + r_{t+1})\right]. \tag{8.28}$$

Following Sargent (1987, pp. 94-95), we can approximate⁵ (8.28) with:

$$E_t\left(\theta m_{t+1}^{-\mu}\right) = E_t\left(c_{t+1}^{-\mu}\right)\left\{\left[1 + E_t(\pi_{t+1})\right]\left[1 + E_t(r_{t+1})\right] - 1\right\} \tag{8.29}$$

Equation (8.29) is equivalent to:

$$E_t\left(\theta m_{t+1}^{-\mu}\right) = E_t\left(c_{t+1}^{-\mu}\right) E_t(i_{t+1}) \tag{8.30}$$

where $i_{t+1} = (1 + \pi_{t+1})(1 + r_{t+1}) - 1$ is the nominal interest rate between date
$t$ and $t + 1$. Furthermore, we can combine (8.21) and (8.24) to get:

$$\beta E_t\left[c_{t+1}^{-\mu}(1 + r_{t+1})\right] = \gamma c_t^{-\mu} \tag{8.31}$$

Consider now equations (8.30) and (8.31). The first one differs only by a random
term from a standard LM function $m_t = LM(c_t, i_t)$, where real money
balances depend upon a transaction variable and an opportunity cost variable.
The second one, instead, can be interpreted as an extended IS function by
imposing a further assumption. If, as stated in McCallum and Nelson (1999a,
pp. 7-10), we are able to approximate fluctuations in income with fluctuations in
consumption (at least for business cycle purposes), then we may substitute income
for consumption in (8.31), and get an extended IS function of the form
$y_t = IS[E_t(y_{t+1}), E_t(r_{t+1})]$. This IS function is non-standard since it incorporates
expectational terms for both the income level and the real interest rate. This
forward-looking aspect is usually absent in standard IS-LM analysis.
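To see why (8.30) has the shape of a standard LM function, it may help to solve it for real balances. The following is only a sketch: it drops the expectation operators and the time subscripts, treating (8.30) as if it held exactly between realized values, which is an additional simplification not made in the text.

```latex
\theta m^{-\mu} = c^{-\mu}\, i
\quad\Longrightarrow\quad
m = c \left(\frac{\theta}{i}\right)^{1/\mu},
\qquad
\ln m = \ln c + \frac{1}{\mu}\ln\theta - \frac{1}{\mu}\ln i .
```

Under this simplification real balances rise one for one with the transaction variable and fall with the opportunity cost, with an interest elasticity of $-1/\mu$.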

8.5 Choice of parameters 

The theoretical framework developed in the previous sections is of limited use, 
at least from an empirical point of view, without the specification of a value for 
all deep parameters in the model. The choice of a particular parameterization, 
however, cannot be arbitrary, but needs some kind of empirical foundation. The 
literature on intertemporally optimized models has shown a clear preference for 
calibrating rather than estimating parameters of interest. To discuss this choice
a brief comparative review of the two approaches might be helpful.

We summarize the estimation approach in Figure 8.1. 

5 Given two random variables, $x$ and $y$, we have that $E(xy) = E(x)E(y) + Cov(x, y)$.
Sargent (1987) approximates the conditional covariance term by zero.
The estimation approach would start from the assumption of the existence of
a Data Generating Process (DGP) summarizing the true multivariate stochastic
process that governs the observed macroeconomic variables. The unknown DGP
would then be approximated by a statistical model, the reduced form, whose
congruency is to be assessed on the basis of diagnostic tests. If the null hypothesis
of congruency is not rejected, then the reduced form is directly available for
forecasting.

The estimation of deep parameters requires the specification of a structural
model. Such a model is usually identified by imposing further, testable, restrictions
on the reduced form. If such over-identifying restrictions are not rejected, then a
set of structural parameters is identifiable and, after estimation, the model can
be used for simulation exercises.

The LSE approach to estimation works rather nicely when applied to data-
driven, dynamic specifications loosely related to theory, but it is much less likely
to succeed when applied to intertemporally optimized models of the kind we have
considered in this chapter. Microeconomic foundations are usually obtained at
the inevitable cost of simplification. Clearly, in numerous cases theoretical models
are far too simple representations of reality (the partial super-neutrality of money
that characterizes our framework is in clear contrast with many stylized monetary
facts discussed in the literature). Most likely, any formal test would conclude that
the model is statistically false, if enough observed data were available. Such a
result would lead us to reject the whole analysis, preventing us from doing any
kind of numerical exercise. As clearly stated by Eichenbaum (1995, p. 1609),

“... Since all models are wrong along some dimension of the data the
classic Haavelmo (1944) program is not going to be useful in this context.
We do not need high powered econometrics to tell us that models are false.
We know that.”

A critical question arises at this point. Is the inability to completely explain
the observed data a sufficient argument for rejection of an economic model?
A researcher might after all be interested in establishing how far a simple model
like ours can go. A model can be statistically refuted because it simply does not
capture any feature of the data; however, it can also be refuted because it performs
poorly in only a few dimensions, while performing well in others. The researcher
may then be interested in actually identifying the dimensions in which the model
is most at odds with the data. The hope is, of course, that such pieces of information
may improve our understanding, and help us in building a better model,
one that may actually be a congruent representation of the DGP. As suggested
again by Eichenbaum (1995),

“... What we do need are interesting diagnostic tools to help us understand
the dimensions along which mis-specified models do well and the
dimensions along which they do poorly...”

The standard econometric approach is of little use from this point of view, 
since the formal tests often do not suggest any explicit alternative to the model 
under examination, providing no help in the theoretical respecification of the 
model. 

To summarize, then, the strong prior against the statistical truth of our
model, together with the interest in studying its numerical properties anyway,
suggests following a different approach from the formal econometric one previously
described. The approach described here is known as calibration, and was
introduced in macroeconomics⁶ by Kydland and Prescott (1982). It is extensively
described in Cooley (1997), among others. He states (p. 56) that

“...calibration is a strategy for finding numerical values for the parameters
of artificial economic worlds...[it] uses economic theory extensively as the
basis for restricting a general framework and mapping that framework
into the measured data.”

In other words, the aim of calibration is not to provide a congruent repre¬ 
sentation of the data, but simply to find values for the deep parameters of the 
model that are jointly compatible with the theory and the data in particular well 
specified dimensions. Figure 8.2 crudely summarizes the calibration approach. 

The main difference between calibration and standard econometrics lies in
the bi-directional relationship between theory and measurement that characterizes
the former. In econometrics, this relationship goes in one direction only, from
data to theory: the econometrician conditions the information set on the available
data, and searches for the theoretical structure most likely to have generated
them. As stated in Cooley (1995, p. 60),

"...the calibration approach ... views the appropriate data or measurements
as something to be determined in part by the features of the theory."


^6 Calibration was initially used in computable general equilibrium models of public
finance and international trade, as noted by Cooley (1997). The use of calibration methods is
widespread in the natural sciences as well, as remarked by Gallant (1995).






Fig. 8.2. Calibration approach 


First of all, a preliminary, a-theoretical inspection of the data identifies some
general stylized facts that any economic model should internalize. A classical
example is offered by Kaldor’s stylized facts on growth but, as we have shown
in Chapter 6, structural VAR models identified via restrictions that are independent
of the predictions of specific theoretical models can also be used to this end. The
theoretical framework at hand, supplemented by these observed stylized facts,
then provides the parametric class of models whose performance we want to evaluate
(the neoclassical theoretical framework, together with Kaldor’s stylized facts,
generates the standard neoclassical growth model characterized by long-run
balanced growth).

Once a particular model has been developed, it precisely defines the quantities
of interest to be measured, and suggests how available measurements have to be
reorganized if they are inconsistent with the theory. For instance, the concept
of investment in our model is a fairly broad one: since no government or foreign
sectors are explicitly modeled, to obtain a measure of investment that matches
our theoretical concept we need to reorganize the data and sum up private fixed
investment, private consumption of durable goods, government investment, and
net exports. Furthermore, by assuming a Cobb-Douglas production function,
we clearly identify Total Factor Productivity, one of the model’s two stochastic
components, with the standard Solow residual. This series can be reconstructed
from the series on output, labor and capital.

Then, measurements are used to give empirical content to the theory, and
in particular to provide empirically based values for the unknown parameters.
They are chosen, according to Cooley (1997, p. 58),

"...so that the behavior of the model economy matches features of the measured
data in as many dimensions as there are unknown parameters...."

In other words, we first need to specify some features of the data for the model
to reproduce; of course, these features have to be different from the ones under
examination. In our case, the features we want to match are long-run features of
the real and monetary variables, our main interest being the short-run cyclical
properties of the model. Then, we need to find a one-to-one relationship
between these features and the deep parameters of the model. Finally, we invert
this relationship, and find the parameter values that make the model replicate
the observed features.

From this point of view, calibration can be interpreted as a method of moments
estimation procedure that focuses on a limited subset of parameters and sets
the discrepancy between selected simulated and observed moments exactly to zero.
Christiano and Eichenbaum (1992) generalize this idea and propose a variant of
Hansen’s (1982) GMM procedure to estimate and assess stochastic general
equilibrium models using specific moments of the actual data. Other possibilities are
described in Diebold et al. (1994), who minimize the discrepancy between the
spectra of the observed and simulated series at particular frequencies. These
procedures are formal developments of the basic methodological approach, and
share with standard calibration the focus on a limited set of previously selected
moments. Standard econometric methods, by contrast, use in principle the whole
available information set, weighting different moments exclusively according to how
much information on them is contained in the actual data, as for example in
maximum likelihood methods.
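One way to make this interpretation concrete is the following sketch. The notation
is introduced here for illustration only and is not taken from the original text:
$\psi_1$ collects the $q$ calibrated parameters, $\psi_2$ the parameters fixed by
other means, $\hat{m}_i$ are the selected observed moments and $m_i^{sim}$ their
model counterparts. Under these assumptions, calibration solves an exactly
identified system:

```latex
m_i^{sim}(\psi_1;\psi_2) - \hat{m}_i = 0, \qquad i = 1,\dots,q, \qquad q = \dim(\psi_1).
```

A GMM-type procedure would instead typically use more moments than parameters and
minimize a weighted quadratic form in the same discrepancies.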

Generally, not all parameters can be calibrated, simply because there are more
unknown parameters than invertible relationships. A subset of them has to be
left to more standard econometric techniques. This implies that formal statistical
estimation and calibration are not perfect substitutes, but partial complements.
For instance, by assuming a constant long-run growth rate we identify the
exogenous growth component with a linear trend in logarithms, which can be easily
estimated by regressing the natural logarithm of output on a time trend, using
ordinary least squares. Note, however, that in the traditional approach to
calibration these estimates are used quite differently than in econometrics. The
focus is limited to the parameters’ point estimates, the stochastic nature of
these estimates is generally ignored, and the estimation procedure is seen
as a purely mechanical device. Some more recent studies, such as Eichenbaum
(1991), stress the importance of taking into account the parameters’ variability,
and provide different ways to map the parameters’ uncertainty into the uncertainty
of the predicted moments. For instance, Canova (1994) and DeJong et al. (1994)
simulate and evaluate the basic real business cycle model by drawing both the shocks
and the parameters from a-priori densities.
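As a minimal sketch of the trend regression just described, the following MATLAB
fragment fits a linear trend to the logarithm of a synthetic output series. The
series, the sample size and all variable names are illustrative assumptions; they
are not the data used by Cooley and Prescott (1995).

```matlab
% Illustrative sketch: recover the gross quarterly growth rate from a
% linear trend in log output. The data below are synthetic (assumed).
T      = 200;                          % number of quarters (assumed)
trend  = (1:T)';                       % deterministic time trend
g_true = 1.004;                        % growth rate used to build the fake data
lny    = 2 + log(g_true)*trend + 0.01*sin(trend);  % log output plus a small cycle
X      = [ones(T,1), trend];           % regressors: constant and trend
bhat   = X\lny;                        % ordinary least squares coefficients
g_hat  = exp(bhat(2));                 % implied gross quarterly growth rate
fprintf('estimated gross growth rate: %.4f\n', g_hat);
```

In the traditional calibration approach only the point estimate g_hat would be
retained; its standard error is ignored.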

There is a widespread practice in the calibration literature of borrowing parameter
values from previous studies, often just for comparison purposes. Unfortunately,
this practice is often abused: it is admissible only if the measurements used in
these studies refer to the same theoretical concepts, and if the way they were
reorganized is completely compatible with our needs. We will follow this practice
too, and borrow many parameter values from Cooley and Prescott (1995), Cooley and
Hansen (1995), and Gavin and Kydland (1999). Cooley and Prescott carefully
calibrate to the US economy a standard real business cycle model which is
perfectly compatible with the real side of our economy; Cooley and Hansen estimate
the stochastic properties of money growth; Gavin and Kydland study the long-run
properties of money’s income velocity. The latter two studies focus on the
monetary aggregate M1.

Once a parameterization is available, we can simulate the model and perform 
many different kinds of numerical exercises. The results are then evaluated, and 
answers to the main questions are provided. For instance, the ability of the model 
to reproduce some particular features (of course, the ones that are different from 
those used to calibrate it) of the data can be judged. The conclusions drawn may 
then stimulate further theoretical developments. 

The metric chosen to compare the observed properties with the simulated
ones is a critical issue. In the traditional calibration procedure, an informal,
"aesthetic" metric is used. In the words of Kydland and Prescott (1996, p. 75),

"... the sampling distribution of this set of statistics can be determined
to any degree of accuracy for the model and compared with the values of
the set of statistics for the actual economy..."

No formal measure of the distance between simulated and observed properties
is provided, and certainly not a statistical one. This informal evaluation procedure,
however, is perfectly in line with the overall methodological approach previously
outlined, which is more useful for comparing the relative performance of competing
models than for evaluating a model’s ability to reproduce reality. Nonetheless,
increasing attention is paid in the current literature to more formal evaluation
procedures; see, for instance, Canova (1995), Diebold et al. (1994), Smith (1995),
and Watson (1993). For a recent survey, see Canova and Ortega (1996).

8.6 Calibration 

Going back to our model, the complete list of parameters we have to pin down
is the following: the intertemporal elasticity of substitution, $\mu$; the
intertemporal discount factor, $\beta$; the relative weight of real money balances
in the felicity function, $\theta$; the technology coefficient, $\alpha$; the
depreciation rate, $\delta$; the long-run growth rate, $\gamma$; the unconditional
mean of money growth, $\eta$; the persistence parameter for TFP, $\rho$; the
persistence parameter for money growth, $\zeta$; and lastly $\sigma^2_a$ and
$\sigma^2_\eta$, the variances of the shocks to TFP and money growth respectively^7.

Some of these parameters can be estimated on available data, while others
can be recovered from available microeconometric evidence. In particular, the
long-run quarterly growth rate $\gamma$ is estimated by fitting a linear trend to the
logarithm of quarterly GDP; Cooley and Prescott (1995) obtain $\gamma = 1.004$. The
literature provides a whole set of empirical estimates of the elasticity of
intertemporal substitution, as discussed in Kocherlakota (1996); most authors agree
on a figure that lies between 1 and 5, so we choose the standard value of 2 for our
experiments.

^7 The adopted solution method, and in particular the certainty equivalence assumption,
implies that the stochastic properties of the model will not depend on the absolute values of
the standard deviations, but may only depend on their relative size.

A large subset of parameters, containing $\beta$, $\theta$, $\alpha$, and $\delta$,
is left for our calibration exercise. As already anticipated, we choose values for
these parameters that make the model reproduce some long-run features of actual US
data. First of all, then, we have to find out what the long-run features of the
model are.

The Cobb-Douglas technology implies a labor share in income that is constant and
equal to $\alpha$. Cooley and Prescott (1995) carefully reconstruct a consistent measure
of total income and capital income, obtaining a long-run capital share equal to
0.4. We borrow their result and choose $\alpha = 0.6$.

We then impose a certainty equivalence assumption, ensuring in this way that
the unconditional mean of the invariant distribution to which the solution tends
in the long run is equal to the steady-state of the deterministic version of our
system. The steady-state of this deterministic system can be easily computed by
dropping all expectations and time indices from (8.21)-(8.26):



(8.32) 

(8.33) 

(8.34) 

(8.35) 

(8.36) 

(8.37) 


Equations (8.32)-(8.37) implicitly define the steady-state values for the control,
endogenous state and costate variables. Furthermore, we may easily obtain
a closed form solution for the steady-state capital-output ratio, the income
velocity of money, the consumption share in total income, and the government
bond-output ratio. From (8.33) we get:

$$\frac{k}{y} = \frac{\beta\,(1-\alpha)}{\gamma - \beta\,(1-\delta)}. \tag{8.38}$$

Combining (8.32) and (8.34), we obtain:

$$\frac{c}{y}\,\frac{y}{m} = \left(\frac{\eta-\beta}{\beta\theta}\right)^{1/\mu}. \tag{8.39}$$

Solving (8.36) for the investment-capital ratio delivers:

$$\frac{i}{k} = \gamma - 1 + \delta, \tag{8.40}$$

finally, from (8.37):

$$\frac{b}{y} = (\eta - 1)\,\frac{m}{y}\,\frac{\gamma\beta}{1-\beta}. \tag{8.41}$$

Combining (8.38) and (8.40) we derive an expression for the investment share
$i/y = (i/k)(k/y)$ and indirectly for the consumption share in income,
$c/y = 1 - i/y$.

Empirical estimates of the long-run capital-output ratio, the income velocity,
and the consumption share are readily available. In particular, Cooley and
Prescott (1995) obtain a long-run quarterly capital-output ratio equal to 13.28
and a consumption share equal to 0.75; Gavin and Kydland (1999) report a
long-run M1 income velocity equal to 5.3.

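Manipulating (8.38)-(8.41), we can express the parameters $\delta$, $\beta$, $\theta$
as functions of these observable long-run properties:

$$\delta = 1 - \gamma + \left(1 - \frac{c}{y}\right)\frac{y}{k}, \tag{8.42}$$

$$\beta = \frac{\gamma}{(1-\alpha)\,\dfrac{y}{k} + 1 - \delta}, \tag{8.43}$$

$$\theta = \frac{\eta-\beta}{\beta}\left(\frac{y}{c}\,\frac{m}{y}\right)^{\mu}. \tag{8.44}$$

The implied values are $\delta = 0.015$, $\beta = 0.989$, $\theta = 1.22$.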
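The arithmetic behind these implied values can be sketched as follows. The fragment
mirrors PART 3 of main.m given later in the chapter; the variable names are
illustrative, and the figure 5.3 enters exactly as it does in that program (as the
ratio used in the formula for theta), which is an assumption carried over from the
code rather than an independent check of the data.

```matlab
% Sketch: invert the steady-state relationships for delta, beta and theta,
% using the long-run targets quoted in the text (names are illustrative).
gam   = 1.004;    % gross quarterly growth rate
eta   = 1.013;    % gross money growth rate
mu    = 2;        % curvature parameter of the felicity function
alpha = 0.6;      % labor share
ky    = 13.28;    % quarterly capital-output ratio
cy    = 0.75;     % consumption share
my    = 5.3;      % enters as in PART 3 of main.m (assumption)
delta = 1 - gam + (1 - cy)/ky;               % depreciation rate, cf. (8.42)
beta  = gam/((1 - alpha)/ky + 1 - delta);    % discount factor, cf. (8.43)
theta = ((eta - beta)/beta)*(my/cy)^mu;      % weight of money, cf. (8.44)
fprintf('delta = %.3f, beta = %.3f, theta = %.2f\n', delta, beta, theta);
```

With these inputs the fragment prints delta = 0.015, beta = 0.989, theta = 1.22,
matching the values reported in the text.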

Finally, the stochastic process driving TFP can be estimated by fitting an
AR model to the standard Solow residual. Symmetrically, the stochastic process
for the money growth rate can be estimated by simply fitting an AR model
to the logarithm of the actual M1 growth rate. Again, Cooley and Prescott
(1995) obtain $\rho = 0.95$ and $\sigma^2_a = 0.007$, while Cooley and Hansen (1995)
obtain $\eta = 1.013$, $\zeta = 0.49$ and $\sigma^2_\eta = 0.009$. Equipped with
these parameter values we can solve the model and evaluate its performance against
the data.
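A minimal sketch of such an AR(1) fit is given below. The series is simulated, not
an actual Solow residual, and the sample size, the seed and the use of 0.007 as an
innovation standard deviation are assumptions made only to keep the example
self-contained.

```matlab
% Sketch: fit an AR(1) without constant by OLS to a simulated series.
rng(0);                                % fix the seed for reproducibility
T   = 5000;                            % sample size (assumed)
rho = 0.95;                            % persistence used to simulate the data
sig = 0.007;                           % innovation std. dev. (assumed)
a   = zeros(T,1);
for t = 2:T
    a(t) = rho*a(t-1) + sig*randn;     % a(t) = rho*a(t-1) + innovation
end
y       = a(2:T);                      % dependent variable
x       = a(1:T-1);                    % lagged regressor
rho_hat = x\y;                         % OLS estimate of the persistence
res     = y - rho_hat*x;               % fitted innovations
sig_hat = sqrt(res'*res/(numel(y)-1)); % estimated innovation std. dev.
fprintf('rho_hat = %.3f, sig_hat = %.4f\n', rho_hat, sig_hat);
```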


8.7 Model Solution 

To obtain the decision rules, we apply the solution procedure originally proposed
by King, Plosser and Rebelo (1988, KPR). As anticipated in the previous section,
we start by imposing certainty equivalence. This step provides us with a
point in $\mathbb{R}^n_+$, the deterministic steady-state, that corresponds to the
system’s unconditional mean: the unconditional mean of the invariant distribution
to which the solution tends in the long run coincides with the deterministic
steady-state. Assuming that the system’s dynamics take place in a small
ball around the steady-state, we may approximate the non-linear system with
a first-order Taylor expansion centered on the steady-state itself. Actually, we
log-linearize the system, obtaining a linear system of expectational difference
equations that explains the variables’ percentage deviations from the
deterministic steady-state. We then solve this linear system with the standard
Blanchard-Kahn (1981) algorithm. Once an approximate solution is available, we
perform a number of numerical exercises, such as impulse response analysis and
Monte Carlo simulations, and evaluate the results.

8.7.1 Log-linearization 

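Consider a deterministic version of the first order conditions:

$$c_t^{-\mu} = \lambda_t, \tag{8.45}$$

$$\lambda_{t+1}\left[(1-\alpha)\,a_{t+1}k_{t+1}^{-\alpha} + 1 - \delta\right] = \frac{\gamma}{\beta}\,\lambda_t, \tag{8.46}$$

$$\theta p_{t+1}^{\mu} + \lambda_{t+1} = \frac{\eta_t}{\beta}\,\frac{p_{t+1}}{p_t}\,\lambda_t, \tag{8.47}$$

$$\lambda_{t+1}\,(1 + r_{t+1}) = \frac{\gamma}{\beta}\,\lambda_t, \tag{8.48}$$

$$\gamma k_{t+1} = (1-\delta)\,k_t + a_t k_t^{1-\alpha} - c_t, \tag{8.49}$$

$$\gamma b_{t+1} = \frac{1-\eta_t}{p_t} + (1 + r_t)\,b_t. \tag{8.50}$$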


We linearly approximate conditions (8.45)-(8.50) with a first-order Taylor
approximation around the deterministic steady-state, expressing the approximated
conditions in percentage deviations from the steady-state itself.

Consider equation (8.45), and rewrite it as^8:

$$e^{-\mu\tilde{c}_t} = e^{\tilde{\lambda}_t}, \tag{8.51}$$

where for a generic variable $x_t$, $\tilde{x}_t = \ln(x_t)$. The first-order Taylor
approximation of (8.51) around the logarithms of the steady-state values
$\tilde{c}$, $\tilde{\lambda}$ is equal to^9:

$$-\mu e^{-\mu\tilde{c}}\,(\tilde{c}_t - \tilde{c}) = e^{\tilde{\lambda}}\,(\tilde{\lambda}_t - \tilde{\lambda}). \tag{8.52}$$

Since in steady-state $\exp(-\mu\tilde{c}) = \exp(\tilde{\lambda})$, and since
$\tilde{x}_t - \tilde{x} = \ln(x_t/x)$, condition (8.52) can be simplified as^10:

$$-\mu\hat{c}_t = \hat{\lambda}_t, \tag{8.53}$$

where $\hat{x}_t = \ln(x_t/x)$. Equation (8.53) is a log-linearized version of
condition (8.45), expressed in percentage deviation from the steady-state, since
$\hat{x}_t$ is approximately equal to $(x_t - x)/x$.

^8 Along an optimal path both $c_t$ and $\lambda_t$ are strictly positive.

^9 The first order Taylor expansion of a non-linear function $f(x)$ around a point $x_0$ is given
by $f(x) = f(x_0) + \nabla f(x_0)(x - x_0) + \varepsilon(x)$. This implies that
$f(x) - f(x_0) \approx \nabla f(x_0)(x - x_0)$.
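The quality of the approximation of a log deviation by a percentage deviation can be
checked numerically. The sketch below uses a hypothetical steady-state value and a
few hypothetical deviations; none of these numbers comes from the model.

```matlab
% Sketch: compare log deviations with percentage deviations from steady state.
xbar = 2.0;                               % hypothetical steady-state value
x    = xbar*[0.90 0.99 1.00 1.01 1.10];   % values around the steady state
xhat = log(x./xbar);                      % log deviations
pct  = (x - xbar)./xbar;                  % percentage deviations
gap  = abs(xhat - pct);                   % approximation error
disp([pct' xhat' gap']);                  % columns: pct, log deviation, error
```

The error is of second order in the size of the deviation, which is why the
approximation is reliable only in a small neighborhood of the steady-state.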

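Consider now condition (8.46), and rewrite it as:

$$e^{\tilde{\lambda}_{t+1}}\left[(1-\alpha)\,e^{\tilde{a}_{t+1}}e^{-\alpha\tilde{k}_{t+1}} + 1 - \delta\right] = \frac{\gamma}{\beta}\,e^{\tilde{\lambda}_t}.$$

The first-order Taylor approximation of this condition around the steady-state is:

$$\frac{\gamma}{\beta}\,e^{\tilde{\lambda}}(\tilde{\lambda}_t - \tilde{\lambda}) = e^{\tilde{\lambda}}(\tilde{\lambda}_{t+1} - \tilde{\lambda})\left[(1-\alpha)\,e^{\tilde{a}}e^{-\alpha\tilde{k}} + 1 - \delta\right] + (1-\alpha)\,e^{\tilde{\lambda}}e^{-\alpha\tilde{k}}e^{\tilde{a}}(\tilde{a}_{t+1} - \tilde{a}) - e^{\tilde{\lambda}}e^{-\alpha\tilde{k}}e^{\tilde{a}}(1-\alpha)\,\alpha\,(\tilde{k}_{t+1} - \tilde{k}),$$

which is equivalent to:

$$\frac{\gamma}{\beta}\,\hat{\lambda}_t = \hat{\lambda}_{t+1}\left[(1-\alpha)\frac{y}{k} + 1 - \delta\right] + (1-\alpha)\frac{y}{k}\,\hat{a}_{t+1} - \alpha\,(1-\alpha)\frac{y}{k}\,\hat{k}_{t+1}. \tag{8.54}$$

We know that in steady-state:

$$(1-\alpha)\frac{y}{k} + 1 - \delta = \frac{\gamma}{\beta},$$

so we can divide both the left-hand side and the right-hand side of (8.54) by
$(1-\alpha)\frac{y}{k}$, to obtain:

$$-\alpha\hat{k}_{t+1} + \omega\hat{\lambda}_{t+1} - \omega\hat{\lambda}_t = -\hat{a}_{t+1}, \tag{8.55}$$

where:

$$\omega = \frac{k}{y}\,\frac{\gamma}{\beta\,(1-\alpha)}.$$

Note that in equation (8.55), all endogenous state and costate variables are
grouped on the left-hand side, while the exogenous state variable is isolated on
the right-hand side.

^10 There is an easier way to get (8.53): take logs of (8.45), add and subtract $\ln(\lambda)$
from the left-hand side of the result, and consider that $\ln(\lambda) = -\mu\ln(c)$. Note that
this approach is feasible only because (8.45) is already log-linear. Note furthermore that
condition (8.53) is not an approximation, but simply a transformation.

Conditions (8.47) and (8.48) are log-linearly approximated by:

$$\left[\mu\left(1 - \frac{\beta}{\eta}\right) - 1\right]\hat{p}_{t+1} + \hat{p}_t + \frac{\beta}{\eta}\,\hat{\lambda}_{t+1} - \hat{\lambda}_t = \hat{\eta}_t, \tag{8.56}$$

$$\frac{r}{1+r}\,\hat{r}_{t+1} + \hat{\lambda}_{t+1} - \hat{\lambda}_t = 0. \tag{8.57}$$

Finally, conditions (8.49) and (8.50) are log-linearly approximated by:

$$\gamma\frac{k}{y}\,\hat{k}_{t+1} - \left[(1-\delta)\frac{k}{y} + 1 - \alpha\right]\hat{k}_t = -\frac{c}{y}\,\hat{c}_t + \hat{a}_t, \tag{8.58}$$

$$\gamma\hat{b}_{t+1} - (1+r)\,\hat{b}_t + (1-\eta)\,\varphi\,\hat{p}_t - r\hat{r}_t = -\eta\varphi\,\hat{\eta}_t, \tag{8.59}$$

where $\varphi = (m/y)(y/b)$.

The exogenous processes for technology and money can be re-written as

$$\hat{a}_{t+1} = \rho\,\hat{a}_t + \varepsilon^a_t, \tag{8.60}$$

$$\hat{\eta}_{t+1} = \zeta\,\hat{\eta}_t + \varepsilon^\eta_t. \tag{8.61}$$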

8.7.2 The linearized system 

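The linearized first order conditions and the two exogenous processes for money
and technology form a system of eight dynamic equations for which we have to
find a solution:

$$\begin{aligned}
\mu\hat{c}_t + \hat{\lambda}_t &= 0,\\
-\alpha\hat{k}_{t+1} + \omega\hat{\lambda}_{t+1} - \omega\hat{\lambda}_t &= -\hat{a}_{t+1},\\
\left[\mu\left(1 - \tfrac{\beta}{\eta}\right) - 1\right]\hat{p}_{t+1} + \hat{p}_t + \tfrac{\beta}{\eta}\hat{\lambda}_{t+1} - \hat{\lambda}_t &= \hat{\eta}_t,\\
\tfrac{r}{1+r}\,\hat{r}_{t+1} + \hat{\lambda}_{t+1} - \hat{\lambda}_t &= 0,\\
\gamma\tfrac{k}{y}\hat{k}_{t+1} - \left[(1-\delta)\tfrac{k}{y} + 1 - \alpha\right]\hat{k}_t + \tfrac{c}{y}\hat{c}_t &= \hat{a}_t,\\
\gamma\hat{b}_{t+1} - (1+r)\hat{b}_t + (1-\eta)\varphi\hat{p}_t - r\hat{r}_t &= -\eta\varphi\hat{\eta}_t,\\
\hat{a}_{t+1} &= \rho\hat{a}_t + \varepsilon^a_t,\\
\hat{\eta}_{t+1} &= \zeta\hat{\eta}_t + \varepsilon^\eta_t.
\end{aligned} \tag{8.62}$$

Substituting for $\hat{\lambda}_t$ in terms of $\hat{c}_t$ from the first equation, we
re-write (8.62) as:

$$\begin{aligned}
-\alpha\hat{k}_{t+1} - \omega\mu\hat{c}_{t+1} + \omega\mu\hat{c}_t &= -\hat{a}_{t+1},\\
\left[\mu\left(1 - \tfrac{\beta}{\eta}\right) - 1\right]\hat{p}_{t+1} + \hat{p}_t - \tfrac{\beta}{\eta}\mu\hat{c}_{t+1} + \mu\hat{c}_t &= \hat{\eta}_t,\\
\tfrac{r}{1+r}\,\hat{r}_{t+1} - \mu\hat{c}_{t+1} + \mu\hat{c}_t &= 0,\\
\gamma\tfrac{k}{y}\hat{k}_{t+1} - \left[(1-\delta)\tfrac{k}{y} + 1 - \alpha\right]\hat{k}_t + \tfrac{c}{y}\hat{c}_t &= \hat{a}_t,\\
\gamma\hat{b}_{t+1} - (1+r)\hat{b}_t + (1-\eta)\varphi\hat{p}_t - r\hat{r}_t &= -\eta\varphi\hat{\eta}_t,\\
\hat{a}_{t+1} &= \rho\hat{a}_t + \varepsilon^a_t,\\
\hat{\eta}_{t+1} &= \zeta\hat{\eta}_t + \varepsilon^\eta_t.
\end{aligned} \tag{8.63}$$

Now by defining $\mathbf{s}_t = [\hat{k}_t \,|\, \hat{b}_t \,|\, \hat{p}_t \,|\, \hat{r}_t \,|\, \hat{c}_t]'$
and $\mathbf{e}_t = [\hat{a}_t \,|\, \hat{\eta}_t]'$, we can rewrite the deterministic
version of (8.63) as:

$$\mathbf{M}^0_s\mathbf{s}_{t+1} = \mathbf{M}^1_s\mathbf{s}_t + \mathbf{M}^0_e\mathbf{e}_{t+1} + \mathbf{M}^1_e\mathbf{e}_t, \tag{8.64}$$

$$\mathbf{e}_{t+1} = \boldsymbol{\rho}\,\mathbf{e}_t,$$

where $\boldsymbol{\rho}$ is the diagonal matrix with the persistence parameters $\rho$
and $\zeta$ on the diagonal, and the remaining matrices collect the coefficients of (8.63):

$$\mathbf{M}^0_s = \begin{bmatrix}
-\alpha & 0 & 0 & 0 & -\omega\mu\\
0 & 0 & \mu\left(1-\frac{\beta}{\eta}\right)-1 & 0 & -\frac{\beta}{\eta}\mu\\
0 & 0 & 0 & \frac{r}{1+r} & -\mu\\
\gamma\frac{k}{y} & 0 & 0 & 0 & 0\\
0 & \gamma & 0 & 0 & 0
\end{bmatrix},\quad
\mathbf{M}^1_s = \begin{bmatrix}
0 & 0 & 0 & 0 & -\omega\mu\\
0 & 0 & -1 & 0 & -\mu\\
0 & 0 & 0 & 0 & -\mu\\
(1-\delta)\frac{k}{y}+1-\alpha & 0 & 0 & 0 & -\frac{c}{y}\\
0 & 1+r & -(1-\eta)\varphi & r & 0
\end{bmatrix},$$

$$\mathbf{M}^0_e = \begin{bmatrix} -1 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0\\ 0 & 0 \end{bmatrix},\quad
\mathbf{M}^1_e = \begin{bmatrix} 0 & 0\\ 0 & 1\\ 0 & 0\\ 1 & 0\\ 0 & -\eta\varphi \end{bmatrix}.$$

Under our certainty equivalence assumption, randomness can be reintroduced by
simply taking the conditional expectation of (8.64):

$$\mathbf{M}^0_s E_t(\mathbf{s}_{t+1}) = \mathbf{M}^1_s\mathbf{s}_t + \mathbf{M}^0_e E_t(\mathbf{e}_{t+1}) + \mathbf{M}^1_e\mathbf{e}_t. \tag{8.65}$$

As $E_t(\mathbf{e}_{t+1}) = \boldsymbol{\rho}\,\mathbf{e}_t$, (8.65) becomes^11:

$$E_t(\mathbf{s}_{t+1}) = \mathbf{W}\mathbf{s}_t + \mathbf{A}\mathbf{e}_t, \tag{8.66}$$

where

$$\mathbf{W} = (\mathbf{M}^0_s)^{-1}\mathbf{M}^1_s, \qquad \mathbf{A} = (\mathbf{M}^0_s)^{-1}(\mathbf{M}^0_e\boldsymbol{\rho} + \mathbf{M}^1_e).$$

System (8.66) is a linear system of expectational difference equations. We solve it by
applying the algorithm proposed by Blanchard and Kahn.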


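8.7.3 The Blanchard-Kahn algorithm


If $\mathbf{P}$ is the modal matrix of $\mathbf{W}$ and $\boldsymbol{\mu}$ its canonical
form (with the eigenvalues on the diagonal ordered in ascending absolute value), and
if $\mathbf{P}$ is invertible, we may decompose $\mathbf{W}$ as
$\mathbf{W} = \mathbf{P}\boldsymbol{\mu}\mathbf{P}^{-1}$. We partition the vector of
endogenous state variables as $\mathbf{s}_t' = [\mathbf{s}_{1t}' \,|\, \mathbf{s}_{2t}']$,
where $\mathbf{s}_{1t}' = [\hat{k}_t \,|\, \hat{b}_t]$ contains the backward-looking
variables, and $\mathbf{s}_{2t}' = [\hat{p}_t \,|\, \hat{r}_t \,|\, \hat{c}_t]$ the
forward-looking ones. Let us furthermore partition the matrices $\mathbf{W}$,
$\boldsymbol{\mu}$, $\mathbf{P}^{-1}$ and $\mathbf{A}$ as:

$$\mathbf{W} = \begin{bmatrix}\mathbf{w}_{11} & \mathbf{w}_{12}\\ \mathbf{w}_{21} & \mathbf{w}_{22}\end{bmatrix},\quad
\mathbf{P}^{-1} = \begin{bmatrix}\mathbf{q}_{11} & \mathbf{q}_{12}\\ \mathbf{q}_{21} & \mathbf{q}_{22}\end{bmatrix},\quad
\boldsymbol{\mu} = \begin{bmatrix}\boldsymbol{\mu}_1 & \mathbf{0}\\ \mathbf{0} & \boldsymbol{\mu}_2\end{bmatrix},\quad
\mathbf{A} = \begin{bmatrix}\mathbf{a}_1\\ \mathbf{a}_2\end{bmatrix}. \tag{8.67}$$

The dynamics of (8.66) are governed by the eigenvalues of $\mathbf{W}$. In order to have
a unique initial vector of values for the forward-looking variables compatible with
the transversality conditions we need the system to be saddle-point stable. This
result is achieved when the first two eigenvalues of $\mathbf{W}$ are stable (strictly
less than one in absolute value) and the last three unstable.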
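The eigenvalue count behind the saddle-point condition can be sketched in a few
lines. The matrix W below is a small hypothetical example chosen so that its
eigenvalues are easy to read off; it is not the matrix of the model, and ns is set
by assumption.

```matlab
% Sketch: check the saddle-point condition for a hypothetical matrix W.
W  = [0.9 0.1 0.0;
      0.0 0.5 0.2;
      0.0 0.0 1.5];              % illustrative; eigenvalues are 0.9, 0.5 and 1.5
ns = 2;                          % number of backward-looking variables (assumed)
lam      = sort(abs(eig(W)));    % eigenvalue moduli in ascending order
n_stable = sum(lam < 1);         % eigenvalues strictly inside the unit circle
if n_stable == ns
    disp('saddle-point stable: unique solution');
elseif n_stable > ns
    disp('too many stable eigenvalues: multiple solutions');
else
    disp('too few stable eigenvalues: no stable solution');
end
```

In the model of this chapter the same check would be applied to the five-by-five
matrix W with ns equal to two.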

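Pre-multiplying (8.66) by $\mathbf{P}^{-1}$, we can transform the original system into a
new one containing two decoupled vectors of difference equations:

$$E_t(\mathbf{z}_{t+1}) = \boldsymbol{\mu}\mathbf{z}_t + \mathbf{B}\mathbf{e}_t, \tag{8.68}$$

where $\mathbf{z}_t = \mathbf{P}^{-1}\mathbf{s}_t$ and $\mathbf{B} = \mathbf{P}^{-1}\mathbf{A}$.

The backward-looking sub-system is equivalent to:

$$E_t(\mathbf{z}_{1t+1}) = \boldsymbol{\mu}_1\mathbf{z}_{1t} + \mathbf{b}_1\mathbf{e}_t, \tag{8.69}$$

where $\mathbf{b}_1$ is implicitly defined by $\mathbf{B} = [\mathbf{b}_1' \,|\, \mathbf{b}_2']'$.
Since the eigenvalues in $\boldsymbol{\mu}_1$ are less than one in absolute value, (8.69)
is stable in the forward direction; furthermore, since $\mathbf{z}_{1t}$ is predetermined,
the initial conditions completely determine its solution.

^11 Under the maintained hypothesis that $\mathbf{M}^0_s$ is invertible.

Conversely, the forward-looking sub-system:

$$E_t(\mathbf{z}_{2t+1}) = \boldsymbol{\mu}_2\mathbf{z}_{2t} + \mathbf{b}_2\mathbf{e}_t, \tag{8.70}$$

is stable in the backward direction, since the elements of $\boldsymbol{\mu}_2$ exceed one
in absolute value. Thus it is necessary to impose a terminal rather than an initial
condition.

Rewrite (8.70) as:

$$\mathbf{z}_{2t} = \boldsymbol{\mu}_2^{-1}E_t(\mathbf{z}_{2t+1}) - \boldsymbol{\mu}_2^{-1}\mathbf{b}_2\mathbf{e}_t = \mathbf{d}_1E_t(\mathbf{z}_{2t+1}) - \mathbf{d}_2\mathbf{e}_t, \tag{8.71}$$

with $\mathbf{d}_1 = \boldsymbol{\mu}_2^{-1}$ and $\mathbf{d}_2 = \boldsymbol{\mu}_2^{-1}\mathbf{b}_2$.
Applying the recursive substitution method described in Sargent (1987), and denoting by
$\boldsymbol{\rho}$ the persistence matrix of the exogenous states, so that
$E_t(\mathbf{e}_{t+k}) = \boldsymbol{\rho}^k\mathbf{e}_t$, we solve (8.71) as:

$$\mathbf{z}_{2t} = -\sum_{k=0}^{\infty}\mathbf{d}_1^k\mathbf{d}_2E_t(\mathbf{e}_{t+k}) = -\sum_{k=0}^{\infty}\mathbf{d}_1^k\mathbf{d}_2\boldsymbol{\rho}^k\mathbf{e}_t = \mathbf{L}_{ee}\mathbf{e}_t. \tag{8.72}$$

Applying the vec operator to $\mathbf{L}_{ee}$ and letting the number of terms in the sum
go to infinity, we obtain:

$$\operatorname{vec}(\mathbf{L}_{ee}) = -(\mathbf{I} - \boldsymbol{\rho}'\otimes\mathbf{d}_1)^{-1}\operatorname{vec}(\mathbf{d}_2). \tag{8.73}$$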
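The closed form in (8.73) can be checked numerically against a truncated version of
the sum in (8.72). The matrices below are hypothetical: d1 stands for the inverse of a
diagonal matrix of unstable eigenvalues, d2 is an arbitrary matrix of conformable
size, and rho holds the two persistence parameters quoted in the text.

```matlab
% Sketch: compare the vec/Kronecker closed form with a truncated sum.
d1  = diag([1/1.5, 1/2.0, 1/3.0]);       % inverse unstable eigenvalues (hypothetical)
d2  = [0.3 -0.1; 0.2 0.4; -0.5 0.1];     % hypothetical nl-by-nn matrix
rho = [0.95 0; 0 0.49];                  % persistence of the exogenous states
nl  = size(d1,1);
nn  = size(rho,1);
% closed form: vec(L) = -(I - kron(rho',d1))^(-1) vec(d2)
lee = -(eye(nl*nn) - kron(rho',d1))\d2(:);
lee = reshape(lee,nl,nn);
% truncated sum: L = -sum_k d1^k * d2 * rho^k
S = zeros(nl,nn);
for k = 0:200
    S = S - d1^k*d2*rho^k;
end
err = max(abs(lee(:) - S(:)));           % discrepancy between the two
fprintf('maximum discrepancy: %.2e\n', err);
```

The sum converges because every product of an eigenvalue of rho and an eigenvalue of
d1 is smaller than one in absolute value, which is the case whenever the exogenous
processes are stationary and d1 collects inverses of unstable eigenvalues.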

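By construction:

$$\mathbf{z}_{1t} = \mathbf{q}_{11}\mathbf{s}_{1t} + \mathbf{q}_{12}\mathbf{s}_{2t}, \tag{8.74}$$

$$\mathbf{z}_{2t} = \mathbf{q}_{21}\mathbf{s}_{1t} + \mathbf{q}_{22}\mathbf{s}_{2t}.$$

We can solve the second expression in (8.74) for $\mathbf{s}_{2t}$:

$$\mathbf{s}_{2t} = \mathbf{q}_{22}^{-1}\mathbf{z}_{2t} - \mathbf{q}_{22}^{-1}\mathbf{q}_{21}\mathbf{s}_{1t}, \tag{8.75}$$

or:

$$\mathbf{s}_{2t} = \mathbf{L}_s\mathbf{s}_{1t} + \mathbf{L}_e\mathbf{e}_t = \mathbf{L}_v\mathbf{v}_t, \tag{8.76}$$

where $\mathbf{L}_e = \mathbf{q}_{22}^{-1}\mathbf{L}_{ee}$, $\mathbf{L}_s = -\mathbf{q}_{22}^{-1}\mathbf{q}_{21}$,
$\mathbf{L}_v = [\mathbf{L}_s \,|\, \mathbf{L}_e]$, and $\mathbf{v}_t = [\mathbf{s}_{1t}' \,|\, \mathbf{e}_t']'$.

The first two equations of the original system (8.66) are given by:

$$E_t(\mathbf{s}_{1t+1}) = \mathbf{w}_{11}\mathbf{s}_{1t} + \mathbf{w}_{12}\mathbf{s}_{2t} + \mathbf{a}_1\mathbf{e}_t. \tag{8.77}$$

Since $\mathbf{s}_{1t}$ is predetermined in the Blanchard-Kahn sense (expectational error
equal to zero), we can rewrite (8.77) as:

$$\mathbf{s}_{1t+1} = \mathbf{w}_{11}\mathbf{s}_{1t} + \mathbf{w}_{12}\mathbf{s}_{2t} + \mathbf{a}_1\mathbf{e}_t. \tag{8.78}$$

By substituting (8.76) into (8.78) we have:

$$\mathbf{s}_{1t+1} = (\mathbf{w}_{11} + \mathbf{w}_{12}\mathbf{L}_s)\,\mathbf{s}_{1t} + (\mathbf{w}_{12}\mathbf{L}_e + \mathbf{a}_1)\,\mathbf{e}_t. \tag{8.79}$$

Combining (8.79) with the stochastic processes for money and technology we
have:

$$\mathbf{v}_{t+1} = \mathbf{M}_v\mathbf{v}_t + \mathbf{u}_t, \tag{8.80}$$

where:

$$\mathbf{M}_v = \begin{bmatrix}\mathbf{w}_{11} + \mathbf{w}_{12}\mathbf{L}_s & \mathbf{w}_{12}\mathbf{L}_e + \mathbf{a}_1\\ \mathbf{0} & \boldsymbol{\rho}\end{bmatrix}, \tag{8.81}$$

$$\mathbf{v}_t = \begin{bmatrix}\mathbf{s}_{1t}\\ \mathbf{e}_t\end{bmatrix},\quad
\mathbf{u}_t = \begin{bmatrix}\mathbf{0}\\ \boldsymbol{\varepsilon}_t\end{bmatrix},\quad
\boldsymbol{\varepsilon}_t = \begin{bmatrix}\varepsilon^a_t\\ \varepsilon^\eta_t\end{bmatrix}, \tag{8.82}$$

and $\boldsymbol{\rho}$ is the persistence matrix of the exogenous states.

System (8.80) describes a first-order vector autoregression; iterating on it
and taking (8.76) into account we recover the sequence of probability
distributions that represents the (approximated) solution to our non-linear system
of stochastic difference equations.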
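Iterating on a first-order vector autoregression of this kind can be sketched as
follows. The matrices Mv and Lv are hypothetical placeholders with the block
structure described above (they are not the solution of the model), and the diagonal
of covar reuses the two variance figures quoted in the text.

```matlab
% Sketch: simulate v(t+1) = Mv*v(t) + u(t) and recover s2(t) = Lv*v(t).
rng(0);                                   % fix the seed for reproducibility
ns = 2; nn = 2; nl = 3;                   % endogenous states, exogenous states, costates
Mv = [0.95 0.00 0.10 0.00;
      0.02 0.90 0.00 0.05;
      0.00 0.00 0.95 0.00;
      0.00 0.00 0.00 0.49];               % hypothetical transition matrix
Lv = [0.1 0.0 0.3 -0.2;
      0.0 0.2 0.1  0.4;
      0.5 0.0 0.6  0.0];                  % hypothetical decision rule
covar = zeros(ns+nn);
covar(ns+1:ns+nn,ns+1:ns+nn) = [0.007,0;0,0.009];
sd = sqrt(diag(covar));                   % valid because covar is diagonal
T  = 200;                                 % simulation length (assumed)
v  = zeros(ns+nn,T);                      % start at the steady state
s2 = zeros(nl,T);
for t = 1:T-1
    s2(:,t)  = Lv*v(:,t);                 % forward-looking variables
    u        = sd.*randn(ns+nn,1);        % innovations hit exogenous states only
    v(:,t+1) = Mv*v(:,t) + u;             % state transition
end
s2(:,T) = Lv*v(:,T);
```

Repeating the loop over many independent draws of the innovations gives a Monte
Carlo approximation to the distribution of any statistic of the simulated series.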

Having found a solution, there are two more variables of interest we would
like to track, namely output and investment^12, defined as:

$$y_t = a_tk_t^{1-\alpha}, \tag{8.83}$$

$$i_t = y_t - c_t.$$

Log-linearization of (8.83) delivers:

$$\hat{y}_t = \hat{a}_t + (1-\alpha)\,\hat{k}_t, \tag{8.84}$$

$$\hat{i}_t = \frac{y}{i}\,\hat{a}_t + (1-\alpha)\,\frac{y}{i}\,\hat{k}_t - \frac{c}{i}\,\hat{c}_t.$$

The paths for output and investment are then immediately available from the
solution of our system.


8.8 Implementation 

The procedure described in the previous section can be implemented in the MATLAB
matrix programming language. We provide procedures for the solution of a
more general problem than the one described in the previous section. In fact, we
consider our model as a special case of a generic forward-looking system in which
variables can be organized as controls, endogenous states, costates and exogenous
states, and to which the Blanchard-Kahn methodology is applicable. The
implementation requires the files main.m, kpr.m, and bk.m.

The file main.m contains all the building blocks needed to run the KPR
procedure.

^12 Note that the real wage is a multiple of output. The dynamic properties
of output and the real wage therefore coincide, and we do not study the real wage separately.

% PART 1
% names, positions and numbers of the variables
name=['k ';'b ';'a ';'e ';'c ';'y ';'i ';'p ';'r ';'l '];
kp=1; bp=2; ap=3; ep=4; cp=5; yp=6; ip=7; pp=8; rp=9; lp=10;
nc=1; ns=2; nl=3; nn=2; nxf=2; nvar=ns+nl+nc+nn+nxf; nt=nl+ns;

% PART 2
% parameterization
mu=2;                   % risk aversion
sn=0.6;                 % alpha
sc=0.75;                % c/y share
g=1.004;                % growth rate
rky=13.28;              % capital-output ratio
rmy=5.3;                % money income velocity
rby=0.6;                % bonds-output ratio (recomputed in PART 3)
eta=1.013;              % money growth rate
rho=[0.95 0;0 0.49];    % persistence of shocks
covar=zeros(ns+nn);
covar(ns+1:ns+nn,ns+1:ns+nn)=[0.007,0;0,0.009];

% PART 3
% steady state and calibrated parameters
sk=1-sn;                                % 1-alpha
si=1-sc;                                % i/y share
d=1-g+si/rky;                           % delta
be=g/(sk/rky+1-d);                      % beta tilde
r=(g-be)/be;                            % real int. rate
theta=((eta-be)/be)*(rmy/sc)^mu;        % pref. param.
capd=g/be*rky/sk;                       % omega in (8.55)
vphi=(1-be)/((eta-1)*g*be);             % var. phi
rby=(eta-1)*rmy*((g*be)/(1-be));        % b/y ratio

% PART 4
% matrices of the linearized system
muu=-mu;
mus=[0,0,0,0,1];
mue=[0,0];
ms0=[-sn,0,0,0,capd; 0,0,mu*(1-be/eta)-1,0,be/eta; 0,0,0,r/(1+r),1;...
     g*rky,0,0,0,0; 0,g,0,0,0];
ms1=[0,0,0,0,-capd; 0,0,1,0,-1; 0,0,0,0,-1; -(1-d)*rky-sk,0,0,0,0;...
     0,-g/be,(1-eta)*vphi,-r,0];
mu0=[0;0;0;0;0];
mu1=[0;0;0;-sc;0];
me0=[-1,0; 0,0; 0,0; 0,0; 0,0];
me1=[0,0; 0,1; 0,0; 1,0; 0,-eta*vphi];
fvu=[0;-sc/si];
fvv=[sk,0,1,0; sk/si,0,1/si,0];
fvl=[0,0,0; 0,0,0];

PART 1 of the program defines a vector name containing the names of the variables;
it then specifies the position of each variable in the system. Variables
are then divided into controls, endogenous states, costates, exogenous states and
other variables of interest: nc is the number of controls, ns the number of
endogenous states, nl the number of costates, nn the number of exogenous states,
and nxf the number of other variables of interest. Finally, it stores the total
number of variables in the system, nvar, and the total number of endogenous state
and costate variables, nt.

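Recall our system:

$$\begin{aligned}
\mu\hat{c}_t + \hat{\lambda}_t &= 0,\\
-\alpha\hat{k}_{t+1} + \omega\hat{\lambda}_{t+1} - \omega\hat{\lambda}_t &= -\hat{a}_{t+1},\\
\left[\mu\left(1 - \tfrac{\beta}{\eta}\right) - 1\right]\hat{p}_{t+1} + \hat{p}_t + \tfrac{\beta}{\eta}\hat{\lambda}_{t+1} - \hat{\lambda}_t &= \hat{\eta}_t,\\
\tfrac{r}{1+r}\,\hat{r}_{t+1} + \hat{\lambda}_{t+1} - \hat{\lambda}_t &= 0,\\
\gamma\tfrac{k}{y}\hat{k}_{t+1} - \left[(1-\delta)\tfrac{k}{y} + 1 - \alpha\right]\hat{k}_t + \tfrac{c}{y}\hat{c}_t &= \hat{a}_t,\\
\gamma\hat{b}_{t+1} - (1+r)\hat{b}_t + (1-\eta)\varphi\hat{p}_t - r\hat{r}_t &= -\eta\varphi\hat{\eta}_t,\\
\hat{a}_{t+1} &= \rho\hat{a}_t + \varepsilon^a_t,\\
\hat{\eta}_{t+1} &= \zeta\hat{\eta}_t + \varepsilon^\eta_t.
\end{aligned}$$

$\mathbf{u}_t = [\hat{c}_t]$ is the vector containing the only control,
$\mathbf{s}_t = [\hat{k}_t \,|\, \hat{b}_t \,|\, \hat{p}_t \,|\, \hat{r}_t \,|\, \hat{\lambda}_t]'$ is
the vector containing the subvector $\mathbf{s}_{1t} = [\hat{k}_t \,|\, \hat{b}_t]'$ of the two
endogenous states and the subvector
$\mathbf{s}_{2t} = [\hat{p}_t \,|\, \hat{r}_t \,|\, \hat{\lambda}_t]'$ of the three costates;
lastly, $\mathbf{e}_t = [\hat{a}_t \,|\, \hat{\eta}_t]'$ is the vector containing the
exogenous states. In addition we define a vector
$\mathbf{f}_t = [\hat{y}_t \,|\, \hat{i}_t]'$ with the two other variables of interest.

PART 2 of the program stores the complete parameterization used in our
exercise, and creates the covariance matrix, denoted covar, of the vector of
innovations $\mathbf{u}_t$ in (8.80). PART 3 solves for the steady-state and obtains the
value of all calibrated parameters. Finally, PART 4 stores the matrices that describe
the linearized system that will be solved by the KPR procedure. Note that we
can re-write the first equation of our system as:

$$\mathbf{M}_{uu}\mathbf{u}_t = \mathbf{M}_{us}\mathbf{s}_t + \mathbf{M}_{ue}\mathbf{e}_t, \tag{8.85}$$

where $\mathbf{M}_{uu} = [-\mu]$, $\mathbf{M}_{us} = [0 \,|\, 0 \,|\, 0 \,|\, 0 \,|\, 1]$, and
$\mathbf{M}_{ue} = [0 \,|\, 0]$.

The equations for the variables contained in the vector $\mathbf{s}_t$ can then be written
as:

$$(\mathbf{M}^0_{ss} + \mathbf{M}^1_{ss}L)\,\mathbf{s}_{t+1} = (\mathbf{M}^0_{su} + \mathbf{M}^1_{su}L)\,\mathbf{u}_{t+1} + (\mathbf{M}^0_{se} + \mathbf{M}^1_{se}L)\,\mathbf{e}_{t+1}, \tag{8.86}$$

where $L$ is the lag operator and, as stored in PART 4 of main.m:

$$\mathbf{M}^0_{ss} = \begin{bmatrix}
-\alpha & 0 & 0 & 0 & \omega\\
0 & 0 & \mu\left(1-\frac{\beta}{\eta}\right)-1 & 0 & \frac{\beta}{\eta}\\
0 & 0 & 0 & \frac{r}{1+r} & 1\\
\gamma\frac{k}{y} & 0 & 0 & 0 & 0\\
0 & \gamma & 0 & 0 & 0
\end{bmatrix},\quad
\mathbf{M}^1_{ss} = \begin{bmatrix}
0 & 0 & 0 & 0 & -\omega\\
0 & 0 & 1 & 0 & -1\\
0 & 0 & 0 & 0 & -1\\
-\left[(1-\delta)\frac{k}{y}+1-\alpha\right] & 0 & 0 & 0 & 0\\
0 & -(1+r) & (1-\eta)\varphi & -r & 0
\end{bmatrix},$$

$$\mathbf{M}^0_{su} = \begin{bmatrix}0\\0\\0\\0\\0\end{bmatrix},\quad
\mathbf{M}^1_{su} = \begin{bmatrix}0\\0\\0\\-\frac{c}{y}\\0\end{bmatrix},\quad
\mathbf{M}^0_{se} = \begin{bmatrix}-1 & 0\\0 & 0\\0 & 0\\0 & 0\\0 & 0\end{bmatrix},\quad
\mathbf{M}^1_{se} = \begin{bmatrix}0 & 0\\0 & 1\\0 & 0\\1 & 0\\0 & -\eta\varphi\end{bmatrix}.$$

As far as the other variables of interest are concerned we have:

$$\mathbf{f}_t = \mathbf{F}_{vu}\mathbf{u}_t + \mathbf{F}_{vv}\mathbf{v}_t + \mathbf{F}_{vl}\mathbf{s}_{2t},$$

where $\mathbf{f}_t = [\hat{y}_t \,|\, \hat{i}_t]'$,
$\mathbf{v}_t = [\mathbf{s}_{1t}' \,|\, \mathbf{e}_t']'$ and:

$$\mathbf{F}_{vu} = \begin{bmatrix}0\\ -\frac{c}{i}\end{bmatrix},\quad
\mathbf{F}_{vv} = \begin{bmatrix}1-\alpha & 0 & 1 & 0\\ (1-\alpha)\frac{y}{i} & 0 & \frac{y}{i} & 0\end{bmatrix},\quad
\mathbf{F}_{vl} = \begin{bmatrix}0 & 0 & 0\\ 0 & 0 & 0\end{bmatrix}.$$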
At this stage the KPR procedure can be implemented by calling two
external functions. The first one is kpr.m:

function [h,mv]=kpr(muu,mus,mue,mu0,mu1,ms0,ms1,me0,me1,fvu,fvv,fvl,rho)
nl=size(fvl,2); nt=size(mus,2); ns=nt-nl;
% solve (8.85) for the controls
qus=muu\mus;
que=muu\mue;
qus1=qus(:,1:ns);
qus2=qus(:,ns+1:nt);
% substitute the controls out of the state equations
msss0=ms0-mu0*qus;
msss1=ms1-mu1*qus;
msse0=me0+mu0*que;
msse1=me1+mu1*que;
% matrices W and A of the expectational system
w=-msss0\msss1;
a=msss0\(msse0*rho+msse1);
[lv,mv]=bk(w,a,rho,ns);
% decision rules for controls and other variables of interest
uv=[[qus1,que]+qus2*lv];
fv=fvu*uv+fvv+fvl*lv;
h=[uv;fv;lv];
end

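The KPR function assumes that $\mathbf{M}_{uu}$ is invertible and solves (8.85) for
$\mathbf{u}_t$:

$$\mathbf{u}_t = \mathbf{Q}_{us}\mathbf{s}_t + \mathbf{Q}_{ue}\mathbf{e}_t, \tag{8.87}$$

where $\mathbf{Q}_{us} = \mathbf{M}_{uu}^{-1}\mathbf{M}_{us}$ and
$\mathbf{Q}_{ue} = \mathbf{M}_{uu}^{-1}\mathbf{M}_{ue}$. Using (8.87) at date $t+1$ in
(8.86) we get:

$$(\mathbf{M}^0_{ss} + \mathbf{M}^1_{ss}L)\,\mathbf{s}_{t+1} = (\mathbf{M}^0_{su} + \mathbf{M}^1_{su}L)(\mathbf{Q}_{us}\mathbf{s}_{t+1} + \mathbf{Q}_{ue}\mathbf{e}_{t+1}) + (\mathbf{M}^0_{se} + \mathbf{M}^1_{se}L)\,\mathbf{e}_{t+1}.$$

Rearranging terms, we have:

$$(\mathbf{M}^0_s + \mathbf{M}^1_sL)\,\mathbf{s}_{t+1} = (\mathbf{M}^0_e + \mathbf{M}^1_eL)\,\mathbf{e}_{t+1},$$

where $\mathbf{M}^0_s = \mathbf{M}^0_{ss} - \mathbf{M}^0_{su}\mathbf{Q}_{us}$,
$\mathbf{M}^1_s = \mathbf{M}^1_{ss} - \mathbf{M}^1_{su}\mathbf{Q}_{us}$,
$\mathbf{M}^0_e = \mathbf{M}^0_{se} + \mathbf{M}^0_{su}\mathbf{Q}_{ue}$,
$\mathbf{M}^1_e = \mathbf{M}^1_{se} + \mathbf{M}^1_{su}\mathbf{Q}_{ue}$.

If $\mathbf{M}^0_s$ is invertible, we can solve for $\mathbf{s}_{t+1}$, obtaining:

$$\mathbf{s}_{t+1} = \mathbf{W}\mathbf{s}_t + \mathbf{R}\mathbf{e}_{t+1} + \mathbf{Q}\mathbf{e}_t, \tag{8.88}$$

where $\mathbf{W} = -(\mathbf{M}^0_s)^{-1}\mathbf{M}^1_s$,
$\mathbf{R} = (\mathbf{M}^0_s)^{-1}\mathbf{M}^0_e$, and
$\mathbf{Q} = (\mathbf{M}^0_s)^{-1}\mathbf{M}^1_e$.

Under our certainty equivalence assumption, randomness is reintroduced by
simply taking the conditional expectation of (8.88):

$$E_t(\mathbf{s}_{t+1}) = \mathbf{W}\mathbf{s}_t + \mathbf{R}\,E_t(\mathbf{e}_{t+1}) + \mathbf{Q}\mathbf{e}_t. \tag{8.89}$$

As $E_t(\mathbf{e}_{t+1}) = \boldsymbol{\rho}\,\mathbf{e}_t$, where $\boldsymbol{\rho}$ is the
persistence matrix stored in rho, (8.89) becomes:

$$E_t(\mathbf{s}_{t+1}) = \mathbf{W}\mathbf{s}_t + \mathbf{A}\mathbf{e}_t,$$

where $\mathbf{A} = \mathbf{R}\boldsymbol{\rho} + \mathbf{Q}$.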

We are now ready to call the bk.m function to apply the Blanchard-Kahn
solution algorithm:

function [lv,mv]=bk(w,a,rho,ns)
nt=size(w,1); nl=nt-ns; nn=size(rho,1);
w11=w(1:ns,1:ns);
w12=w(1:ns,ns+1:nt);
% eigen-decomposition, eigenvalues sorted by ascending absolute value
[evec,eval]=eig(w);
[mu1,ind]=sort(abs(diag(eval)));
mu=diag(eval);
mu=mu(ind);
p=evec(:,ind);
mu2=diag(mu(ns+1:nt));           % unstable eigenvalues
ps=p\eye(size(p));               % inverse of the modal matrix
b=ps*a;
d1=mu2\eye(size(mu2));
d2=d1*b(ns+1:nt,:);
% closed form (8.73)
lee=-(eye(nl*nn)-kron(rho',d1))\d2(:);
lee=reshape(lee,nl,nn);
% decision rule (8.76) and transition matrix (8.81)
ls=-ps(ns+1:nt,ns+1:nt)\ps(ns+1:nt,1:ns);
le=ps(ns+1:nt,ns+1:nt)\lee;
lv=[ls,le];
mv=[w11+w12*ls,w12*le+a(1:ns,:);zeros(nn,ns),rho];
end

We have retrieved all matrices that characterize the approximated solution. 

8.9 Model Evaluation 

In Chapter 6 we introduced VAR models of the monetary transmission
mechanism as the statistical framework to produce stylized facts for the
evaluation of theoretical models. We stressed the importance of identifying models
via theory-independent restrictions, and we illustrated how stylized facts
on the monetary transmission mechanism can be described by impulse response
functions. We also emphasized that the derivation of the responses of variables
included in VAR models to unexpected monetary shocks is not meant to
be a policy experiment, but rather a benchmark against which to assess the
performance of theoretical models. The solution of the theoretical model discussed
in this chapter delivers a VAR on which theory-based parameter restrictions
are imposed. We can then use the comparison of impulse responses derived from
the theory-independent VAR and from the solution of our theoretical model as a
model evaluation device. Corroboration of the theoretical model is achieved when
the responses of variables to shocks in the theoretical model match the stylized
facts derived within the empirical VAR. If the theoretical model performs
consistently with the data, then it can be used for policy analysis. Policy analysis
can be validly performed by considering experiments that modify the systematic
and non-systematic components of variables. The micro-foundation of the
model and the separation of deep parameters describing taste and technology
from expectational parameters guarantee the robustness of policy experiments
against the Lucas critique.

We therefore proceed to model evaluation by considering that, in the light
of the VAR-based evidence, a theoretical model of the monetary transmission
mechanism should be consistent with at least three stylized facts: i) after a
monetary policy shock, the initial reaction of the aggregate price level is
extremely limited; ii) the interest rate initially increases; iii) aggregate
output initially falls, but then recovers, so that the long-run effect of a
monetary shock on this variable is zero.

The first experiment we would like to perform with our simplified monetary
model economy is to compute the model's impulse response functions and
compare them with the impulse response functions of an estimated VAR. The
MATLAB file performing this task is called impulse.m.

clear; main;
n=61; t=1:n-1; pr=0;              % sixty quarters, plus one to recover inflation
shockto=['A  ';'Eta'];            % names of the shocked states (rows padded to equal length)
sim=zeros(nvar,n);

[h,mv]=kpr(muu,mus,mue,mu0,mu1,ms0,ms1,me0,me1,fvu,fvv,fvl,rho);
for i=1:2;
    sck=zeros(nn+ns,n+1);         % matrix of innovations
    sck(i+2,1)=1;                 % unit shock at the initial date
    s=sck(:,1);
    for j=1:n;
        sim(:,j)=[s;h*s];
        s=mv*s+sck(:,j+1);
    end;
    for j=1:n-1;
        pi(j)=sim(ep,j)+sim(pp,j+1)-sim(pp,j);   % inflation rate
    end;
    sim=sim(:,1:n-1);             % adjust the sample end-point
    subplot(2,2,1), hnd=plot(t,sim(yp,:)','k-',t,sim(cp,:)','k-.',t,sim(ip,:)','k:');
    legend(hnd,'y','c','i',1);
    title(['Shock to ' shockto(i,:)]);
    ylabel('% Deviation');
    subplot(2,2,2), hnd=plot(t,sim(kp,:)','k-',t,sim(bp,:)','k:');
    legend(hnd,'k','b',1);
    subplot(2,2,3), hnd=plot(t,sim(pp,:)','k-',t,pi,'k:');
    legend(hnd,'p','pi',1);
    subplot(2,2,4), hnd=plot(t,sim(rp,:)','k-');
    legend(hnd,'r',1);
    if pr==0; pause; else
        eval(['print -dbitmap fig' num2str(i)]);
    end;
end;

The procedure begins by clearing the workspace and calling the main.m
file. We then choose a sixty-quarter simulation horizon, create a vector t
that acts as a time index, and set to zero a flag, called pr, that controls
the display procedure (if pr is set to one, the figures are plotted on
the screen and saved on disk as bitmaps). Finally, we define the array shockto,
containing the names of the state variables we are going to shock, and
preallocate the matrix sim, which will contain the simulated series, in order
to speed up computation.

Once the initial steps have been completed, we call the file kpr.m to solve
our log-linearized system. The solution is summarized by the matrices mv and h.
We proceed to shock our two exogenous state variables: TFP and the growth
rate of nominal money balances. We start a loop, which is repeated twice, and
build a matrix of innovations, sck: a matrix of zeros with only one strictly
positive element, corresponding to the initial shock to either TFP or money
growth. Then we iteratively simulate the linearized system, recover the
inflation rate, and adjust the sample end-point. We are now ready to plot
the results, and to save the figures as bitmaps if the flag pr is set to one.
These bitmaps are shown in Figures 8.3 and 8.4.
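
The recursion at the core of this procedure can be isolated in a few lines.
The following is only a minimal sketch, not the book's code: the matrices M
and H and the horizon are made-up values that stand in for the matrices mv
and h returned by kpr.m, and the system has just two states and one control.

```matlab
% Hypothetical two-state system (values chosen only for illustration).
M = [0.95 0.00;        % law of motion of the states: s(t+1) = M*s(t) + shock(t+1)
     0.10 0.50];
H = [1.0 0.5];         % one control variable: x(t) = H*s(t)
n = 20;                % simulation horizon

shock = zeros(2, n+1);
shock(1,1) = 1;        % unit innovation to the first state at the initial date

irf = zeros(3, n);     % rows: the two states, then the control
s = shock(:,1);
for j = 1:n
    irf(:,j) = [s; H*s];        % store states and control at date j
    s = M*s + shock(:,j+1);     % move the state one period forward
end

disp(irf(:,1:3));      % responses in the first three periods
```

Since no further innovation arrives after the initial date, the stored state
response at date j is simply M^(j-1) times the initial shock, which is why a
single loop over the law of motion is enough to trace the whole impulse
response.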

Figure 8.3 shows the model’s reaction to a positive shock to TFP. 


Fig. 8.3. The effect of a shock to Total Factor Productivity in our model
economy

The unexpected increase in TFP induces a parallel increase in current output,
since the current capital stock is fixed. As the productivity shock is highly
persistent, TFP converges slowly to its steady-state level. Consumption and
investment also increase on impact, but the reaction of consumption is less
pronounced, since the representative agent wants to smooth consumption over
time; furthermore, the slope of the consumption path increases on impact, since
the rate of return on physical capital, which depends on its marginal
productivity, is higher than in steady state. The accumulation of physical
capital and the decrease in TFP jointly drive the rate of return down, back to
its steady-state level and then below it. At this point, since the rate of
return is lower than in steady state, the slope of the consumption path becomes
negative. Since consumption and investment are jointly higher than output, the
capital stock is eaten up. The marginal productivity of capital increases and
converges back to its steady-state level. Consumption, output, investment, and
capital converge slowly to their steady-state levels.

Consider now the nominal variables and the stock of government bonds. On
impact, a positive income effect increases the demand for real money balances,
while the nominal money supply still grows at the steady-state rate. The price
level has to decrease on impact to balance demand and supply of real money
balances. The subsequent dynamics of the price level are driven by the dynamics
of the shadow price of installed physical capital, i.e. by the costate
variable. The interest rate on government bonds is also fully determined by the
shadow price of capital from date 1 onwards. To balance the government budget
constraint, the stock of government bonds has to follow a particular path from
date 1 onwards, since both the price level and the interest rate depend only on
the costate variable. At date 0, the interest rate on government bonds has to
balance the government budget constraint for the price level and the required
level of investment in government bonds. From date 1 onwards, then, the
dynamics of the nominal variables and of the stock of government bonds are
fully determined by the path of the shadow value of physical capital.

Figure 8.4, instead, shows the model's reaction to a positive shock to the
growth rate of nominal money balances.

As we can see, the real side does not react at all to monetary shocks.
Since the demand for money balances remains unchanged, the sudden increase in
the nominal money supply has to be counterbalanced by a sharp increase in the
price level. From date 1 onwards, the interest rate on government bonds returns
to its steady-state value, since the shadow price of capital remains constant;
the stock of government bonds, then, has to balance the government budget
constraint for the given price level, whose dynamics depend only on the dynamics
of the growth rate of the nominal money stock. At date 0, the interest rate has
to jump sharply in order to balance the government budget constraint for the
required level of investment in government bonds and the price level.

Fig. 8.4. The effect of a shock to money supply in our model economy

The impulse response functions summarized in Figure 8.4 are in sharp contrast
with the stylized facts proposed by the VAR approach. We may compare Figure
8.4 with, for instance, Figure 6.4. In the VAR model, the effect of a monetary
shock on output is nearly zero on impact, but then clearly positive in the
short run, and again zero in the long run. The effect on the price level is
extremely small on impact, but increases over time. The same applies to the
interest rate, whose reaction to a monetary shock is small on impact but then
increases over time. In our model, the output level simply does not react to
monetary shocks, while both the price level and the interest rate increase
sharply on impact and then return to their steady-state levels.

Evidently, the model is completely unable to reproduce the stylized
facts regarding the dynamic relationship between the output level, the price
index, and the interest rate on government bonds. This result is, however, not
surprising. In our framework, all kinds of frictions or market imperfections
are ruled out, and the role of money is limited, since it simply reduces the
transaction costs associated with shopping. The model's equilibrium outcome,
then, should be more appropriately considered an approximation of the long-run
behavior of the US economy. If this is the case, however, the model at hand is
not the right one to study the cyclical properties of the US nominal variables
and, in particular, the transmission mechanism of monetary policy.

To reinforce these conclusions, we briefly examine the small-sample stochastic
properties of our model by performing some Monte Carlo experiments. In other
words, we draw from a random number generator a finite sequence of innovations,
corresponding to a 100-quarter simulation horizon, and iterate to derive
the simulated series for all exogenous and endogenous variables. To isolate the
dynamics at business cycle frequencies, we filter the simulated series by
applying the so-called Hodrick-Prescott (H-P) filter, with a smoothing
parameter equal to 1600. Then we calculate the statistics of interest, such as
the relative standard deviation of each variable with regard to output, the
autocorrelation coefficient, and the correlation coefficient with output. We
repeat this procedure at least 1000 times, storing the results of each round in
a matrix. Finally, we summarize the empirical distribution of our statistics of
interest by calculating their mean, standard deviation, and so on, across the
1000 replications.
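
To fix ideas on the three statistics just listed, the following minimal sketch
computes them for a pair of short, made-up series standing in for filtered
output and consumption. It uses only built-in functions; the first-order
autocorrelation is computed here as the correlation between the series and its
own lag, which is an assumption and need not coincide with the definition
adopted in the acor.m routine used by the book's program.

```matlab
% Made-up data: stand-ins for H-P filtered output (y) and consumption (c).
y = [0.5; -0.2; 0.3; -0.6; 0.1; 0.4; -0.5];
c = [0.2; -0.1; 0.1; -0.3; 0.1; 0.2; -0.2];

rel_vol = std(c) / std(y);              % standard deviation relative to output

cc = corrcoef(c(2:end), c(1:end-1));    % correlation of c with its first lag
auto1 = cc(1,2);

cy = corrcoef(c, y);                    % contemporaneous correlation with output
cor_y = cy(1,2);

fprintf('%6.2f %6.2f %6.2f\n', rel_vol, auto1, cor_y);
```

In a Monte Carlo exercise these three numbers would be recomputed and stored
for every replication, so that their distribution across replications can be
summarized afterwards.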

The MATLAB program that performs these experiments is simulate.m.

%PART 1
clear; main;
nexp=1000; n=101;                 % replications; 100 quarters plus one for inflation
v1=[cp;ip;pp;rp;kp;bp;ap;ep];     % variables for relative volatility and correlation with output
v2=[yp;v1];                       % variables for the autocorrelation coefficient
m_std=zeros(size(v1,1),nexp); m_cor=zeros(size(v1,1),nexp);
m_auc=zeros(size(v2,1),nexp); inf_st=zeros(3,nexp);
p_st=zeros(3,nexp); sim=zeros(nvar,n); inf=zeros(n,1);

[h,mv]=kpr(muu,mus,mue,mu0,mu1,ms0,ms1,me0,me1,fvu,fvv,fvl,rho);
rn=1;
while rn<=nexp
    sck=covar*randn(ns+nn,n+1); s=sck(:,1);
    for j=1:n;
        sim(:,j)=[s;h*s]; s=mv*s+sck(:,j+1);
    end;
    for j=1:n-1;
        inf(j)=sim(ep,j)+sim(pp,j+1)-sim(pp,j);
    end;
    sim=sim(:,1:n-1); inf=inf(1:n-1);
    sim_hp=hpf(sim',1600); inf_hp=hpf(inf,1600);   % H-P filter, smoothing parameter 1600
    m_std(:,rn)=(std(sim_hp(:,v1))/std(sim_hp(:,yp)))';
    m_auc(:,rn)=acor(sim_hp(:,v2),1)';
    x=corrcoef(sim_hp);
    m_cor(:,rn)=x(v1,yp);
    inf_st(1,rn)=std(inf_hp)/std(sim_hp(:,yp));
    inf_st(2,rn)=acor(inf_hp,1)';
    x=corrcoef(inf_hp,sim_hp(:,yp));
    inf_st(3,rn)=x(1,2);
    rn=rn+1;
end;

%PART 2
stdm(:,1)=mean(m_std')'; stdm(:,2)=median(m_std')';
stdm(:,3)=std(m_std')'; stdm(:,4)=min(m_std')';
stdm(:,5)=max(m_std')';
au(:,1)=mean(m_auc')'; au(:,2)=median(m_auc')';
au(:,3)=std(m_auc')'; au(:,4)=min(m_auc')';
au(:,5)=max(m_auc')';
cor(:,1)=mean(m_cor')'; cor(:,2)=median(m_cor')';
cor(:,3)=std(m_cor')'; cor(:,4)=min(m_cor')';
cor(:,5)=max(m_cor')';
for j=1:3
    st_inf(j,1)=mean(inf_st(j,:)); st_inf(j,2)=median(inf_st(j,:));
    st_inf(j,3)=std(inf_st(j,:)); st_inf(j,4)=min(inf_st(j,:));
    st_inf(j,5)=max(inf_st(j,:));
end;

%PART 3
delete output.txt;
diary output.txt;
t=' '; clc;
disp(['Performed simulations: ' num2str(nexp)]);
disp(' '); disp(' '); disp('Relative Standard deviations:'); disp(' ');
tit='Var. Avg Med Std Min Max';
f='%3.2f\t'; disp(tit); disp(' ');
disp([name(v1,:) t(ones(size(v1,1),1),:) num2str(stdm,f)]);
disp(['PI ' t num2str(st_inf(1,:),f)]);
disp(' '); disp('Autocorrelations:'); disp(' '); disp(tit); disp(' ');
disp([name(v2,:) t(ones(size(v2,1),1),:) num2str(au,f)]);
disp(['PI ' t num2str(st_inf(2,:),f)]); disp(' ');
disp('Correlations with Output:');
disp(' '); disp(tit); disp(' ');
disp([name(v1,:) t(ones(size(v1,1),1),:) num2str(cor,f)]);
disp(['PI ' t num2str(st_inf(3,:),f)]); disp(' ');
diary off;

In PART 1 we start by clearing the workspace and calling the file main.m.
Then we define the number of rounds, nexp, and the simulation horizon, n (we
need a further quarter to recover the inflation rate). We create a vector v1
that identifies the variables for which we want to calculate the relative
volatility and the correlation with output, and a vector v2 that identifies the
variables for which we want to calculate the autocorrelation coefficient.
Finally, we allocate memory for the matrices containing the results of our
experiments.

We are now ready to call the KPR solution procedure and start our sequence
of experiments. We create a matrix of normally distributed innovations, denoted
sck, and iteratively simulate the system. Then we recover the inflation rate
and adjust the sample end-point. Finally, the simulated series are H-P filtered
(by calling the Hpf.m procedure) to isolate the business cycle frequencies. We
can now calculate the statistics of interest and store them in the matrices
previously defined. The whole procedure is repeated nexp times.
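
The filtering routine itself is not listed in the text. As an illustration of
what such a routine does, the sketch below applies the H-P filter to a made-up
series by solving the first-order condition of the filter's minimization
problem directly. It is a hedged illustration under these assumptions, not the
book's Hpf.m, and all variable names are hypothetical.

```matlab
% Illustrative H-P filter; the input series is made up.
T = 100;
lambda = 1600;                       % smoothing parameter for quarterly data
tt = (1:T)';
x = 0.02*tt + 0.5*sin(tt/4);         % linear trend plus a cyclical component

K = diff(eye(T), 2);                 % (T-2) x T second-difference operator

% The trend minimizes sum((x-trend).^2) + lambda*sum((K*trend).^2),
% whose first-order condition is (I + lambda*K'*K)*trend = x.
trend = (eye(T) + lambda*(K'*K)) \ x;
cycle = x - trend;                   % deviations from trend

fprintf('mean of the cyclical component: %g\n', mean(cycle));
```

Because the backslash operator accepts a matrix on the right-hand side, the
same two lines filter several series at once when x has one series per column.
A purely linear series passes entirely into the trend, so its cyclical
component is zero up to rounding error.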

In PART 2 of the program, we describe the empirical distribution of the
statistics of interest, storing the average, the median, the standard
deviation, the minimum and the maximum.

In PART 3 we print on the screen and save as an ASCII file the results
of our Monte Carlo experiments.

Results are summarized in Tables 8.1-8.3. In Table 8.1, we report the stochastic
properties of the model under our benchmark parameterization, when both
exogenous state variables, TFP and the growth rate of nominal money balances,
are hit by random shocks.


TABLE 8.1: Stochastic properties (benchmark parameterization)

Var.    Vol.    Std     Auto    Std     Cor.    Std
y                       0.67    0.06
c       0.32    0.01    0.71    0.08    0.97    0.01
i       3.08    0.01    0.67    0.08    1.00    0.00
k       0.20    0.03    0.93    0.03   -0.01    0.06
b       0.95    0.12    0.43    0.09    0.27    0.13
a       1.00    0.00    0.67    0.08    1.00    0.00
η       1.06    0.17    0.35    0.09    0.00    0.14
p       2.00    0.31    0.35    0.09   -0.13    0.13
π       1.87    0.29   -0.06    0.09    0.02    0.10


The real side of the model performs like the standard stochastic Brock-Mirman
model, and reproduces quite well the main features of the US real business
cycle. Note that the price level is more volatile than output, slightly
autocorrelated, and negatively correlated with output. Furthermore, the
inflation rate is uncorrelated both with its own past values and with
current output. Table 8.2, instead, shows the model's properties when only real
shocks are present.


TABLE 8.2: Stochastic properties (only real shocks)

Var.    Vol.    Std     Auto    Std     Cor.    Std
y                       0.68    0.08
c       0.32    0.01    0.71    0.08    0.97    0.01
i       3.08    0.01    0.67    0.08    1.00    0.00
k       0.20    0.03    0.93    0.03   -0.01    0.07
b       0.44    0.01    0.72    0.08    0.58    0.07
a       1.00    0.00    0.67    0.08    1.00    0.00
η       0.00    0.00    0.00    0.00    0.00    0.00
p       0.27    0.01    0.74    0.07   -0.92    0.02
π       0.19    0.02   -0.05    0.10    0.24    0.08

As we can see, the properties of the real variables, except for the stock of
government bonds, are left unchanged, while the behavior of the nominal
variables changes dramatically. The relative volatility of the price level
drops sharply, its autocorrelation increases, and its correlation with output
reaches almost minus one. The inflation rate becomes slightly positively
correlated with output. Table 8.3, finally, shows the model's properties when
only monetary policy shocks are present.


TABLE 8.3: Stochastic properties (only monetary shocks)

Var.    Vol.    Std     Auto    Std     Cor.    Std
y                       0.00    0.00
c       0.00    0.00    0.00    0.00    0.00    0.00
i       0.00    0.00    0.00    0.00    0.00    0.00
k       0.00    0.00    0.00    0.00    0.00    0.00
b       ∞       0.00    0.34    0.09    0.00    0.00
a       0.00    0.00    0.00    0.00    0.00    0.00
η       ∞       0.00    0.34    0.09    0.00    0.00
p       ∞       0.01    0.34    0.09    0.00    0.00
π       ∞       0.02   -0.06    0.10    0.00    0.00

Evidently, monetary policy shocks influence only the nominal variables, leaving 
the real side completely unaffected. 

8.10 Policy analysis 

The results discussed in the previous section make clear that the proposed model
design cannot be used for monetary policy analysis. McCallum (1999) and
McCallum and Nelson (1999b) perform policy analysis by using the following
modified version of the model:


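y_t = E_t y_{t+1} - \frac{1}{\mu}\left(i_t - E_t \pi_{t+1}\right),   (8.90)

\pi_t = \beta E_t \pi_{t+1} + \gamma_1 y_t - \gamma_2 a_t,   (8.91)

a_t = \rho a_{t-1} + \varepsilon_t,   (8.92)

i_t = \mu_0 + \mu_3 i_{t-1} + (1 - \mu_3)\left((1 + \mu_1) E_t \pi_{t+1} + \mu_2 y_t\right) + \varepsilon^{r}_t ;   (8.93)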

(8.90) is the forward-looking IS equation consistent with the optimization
problem in our model economy; (8.91) is an aggregate supply equation, which
introduces some degree of price stickiness to make the model consistent with
the stylized facts on the monetary transmission mechanism; (8.92) describes
technological shocks exactly as in the model economy; and (8.93) is a central
bank reaction function, consistent with inflation targeting, which replaces the
forward-looking LM equation. As the central bank targets the interest rate, the
effect of monetary policy can be evaluated without the LM equation. In fact,
this relation only defines the quantity of money that the central bank has to
supply in order to keep the interest rate at its target, and it has no
relevance for the analysis of monetary policy. The solution of this new system
delivers a VAR featuring impulse responses consistent with the stylized facts
from the data. Having corroborated the model by analyzing responses to shocks,
it is possible to proceed to policy simulation. McCallum stresses the
importance of analyzing systematic monetary policy and evaluates the impact on
the system of different choices for the parameters $\mu_0$, $\mu_2$, $\mu_3$ in
the central bank's reaction function. As we have already noted, this is
perfectly compatible with the use of impulse responses to monetary policy
shocks as a model evaluation device. The main limitation of this modified model
is that it is not derived explicitly from an intertemporal optimization
problem. In particular, we have seen how a reaction function such as (8.93) can
be derived by solving an intertemporal optimization problem for a strict or a
flexible inflation targeter. The functional specification of (8.93) requires
some preference for interest rate smoothing by the central bank. Moreover, the
parameters $\mu_0$, $\mu_2$, $\mu_3$ describing the optimal response of the
monetary policy maker to macroeconomic conditions are convolutions of the
central bank's preferences and of the parameters determining the structure of
the economy. As a consequence, McCallum's proposal for the simulation of
systematic monetary policy could be further refined by explicitly identifying
the central banker's preferences, in order to consider the impact of their
modification (see footnote 13). Comparison of optimal with sub-optimal monetary
policies could also be interesting for the evaluation of the cost of
sub-optimal policies.
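
The role of interest rate smoothing in the reaction function can be illustrated
with a small simulation. The sketch below assumes that the rule has the
partial-adjustment form $i_t = \mu_0 + \mu_3 i_{t-1} + (1-\mu_3)((1+\mu_1)E_t\pi_{t+1} + \mu_2 y_t)$,
with the policy shock set to zero; the parameter values and the paths of
expected inflation and output are hypothetical and chosen only for
illustration, not taken from McCallum's work.

```matlab
% Hypothetical parameter values (illustration only).
mu0 = 0.0;  mu1 = 0.5;  mu2 = 0.5;  mu3 = 0.8;

n    = 40;
Epi  = ones(n,1);      % expected inflation rises permanently by one point
y    = zeros(n,1);     % output held at its steady-state level
ir   = zeros(n,1);     % policy interest rate
ilag = 0;              % interest rate before the change

for t = 1:n
    ir(t) = mu0 + mu3*ilag + (1-mu3)*((1+mu1)*Epi(t) + mu2*y(t));
    ilag  = ir(t);
end

% Fixed point of the recursion for the constant inputs used above.
longrun = (mu0 + (1-mu3)*(1+mu1)) / (1-mu3);
disp([ir(1) ir(n) longrun]);
```

With these values the interest rate moves by only a fraction (1 - mu3) of its
eventual adjustment on impact and then converges gradually, and the long-run
response to expected inflation, 1 + mu1, exceeds one. Repeating the exercise
for different values of the rule's parameters is a simple way to see how
alternative systematic policies would differ.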


13 On the website associated with the book we make available an exercise that
shows how the optimal policy response is derived and how uncertainty affects
the optimal policy response. Marco Aiolfi has kindly provided both the exercise
and the MATLAB code for its solution, which are available in the files
exlchap8.pdf and exlchap8.m